Unable to get default company ID Exceptions

There are lots of ways to get here, but it may not be clear how you did...

Introduction

Recently I've been helping folks on the community Slack and forums who have faced a common issue. At startup, they hit an exception, java.lang.IllegalStateException: Unable to get default company ID, and then everything grinds to a halt.

One thing they all have in common is they are using either 7.4 CE GA77 or later or 7.4 DXP U77 or later.

Another thing they have in common: they have set company.default.web.id in the portal-ext.properties file to a value other than liferay.com.

After collecting details around the failures, I started opening tickets on issues.liferay.com (now liferay.atlassian.net after the move to Jira Cloud).

There were some true bugs identified and fixed as of GA82 and U82.

And yet, some users are still getting the exceptions, so I thought it was important to talk about what was changed, what was fixed, why you can still get these exceptions and, more importantly, what you can do to not get these exceptions.

So, what happened...

A change was introduced in GA77/U77 dealing with the company.default.web.id property. In GA76/U76 and earlier, the company.default.web.id property was only used the first time the portal was started.

And, when the portal is first started, we know how it connects to the database, creates all of the tables and populates them with data.

One of the things it created was the default Company record, more appropriately referred to as the default Instance, and the web id for this instance was set using the value of the company.default.web.id property (which, unless you override it, is in fact "liferay.com"). Although many of us likely only use a single instance in our environments, Liferay does support hosting multiple instances in a single Liferay cluster.

FYI, the value of an instance (versus a Site, for example) is the virtual wall that exists between instances. Each instance has its own set of users, own set of groups, roles, etc. It can have its own authentication configuration that is completely separate from other instances, instance configuration can be completely different, and there's so much more than what can be quickly listed here.

When an instance has multiple sites, the sites share the same set of users/groups/roles, they can also share content, templates, fragments, etc.

But instances don't share, there's a virtual wall between them which users and data do not cross. So I could create a Coca-Cola™ instance and also a Pepsi™ instance, and even though these are two competitors, the virtual instance wall will keep their data separate and protected from each other.

So you might consider taking advantage of this to host your www brochure site on one instance, but then host your extranet in another instance. You're using a single Liferay cluster, you have multiple instances each with their own users and authentication mechanisms, and you can't bleed content from your sensitive extranet instance into your public brochure instance.

In GA76/U76 and earlier, that was the end of the use for the company.default.web.id property. After first startup and the creation of the default instance, the property was ignored, you could change it all day long and Liferay didn't care.

And this actually was seen as a bug. It meant that a Liferay administrator could not make a new instance and use that one as the default, the first instance created at the first startup was forever going to be the default instance.

And so, in GA77/U77 this bug was fixed. The instance whose web id matches the company.default.web.id property would be treated as the default instance. A change to the property value would basically mean changing the default instance.

Wahoo, bug fixed!

We Have Exceptions!

After the release of GA77/U77 we started receiving reports of failures logging into new environments and, after community involvement and some analysis, we figured out how to reproduce the issue and shared it with Liferay on LPS-187661. Long story short, if you used anything except liferay.com as the admin email address in the setup wizard, it would create the user account but you couldn't use it to log in.

Now, this was a real bug and, although maybe not directly tied to the company.default.web.id, it was somewhat related.

I don't have a clear picture of how it is related even after looking at the pull request that fixed the issue, but it certainly seems related since the change for handling the company.default.web.id happened at the same time this new failure started...

So this bug was fixed and released in GA82/U82.

Yay Team!

Still Happening?

Recently though there were more reports of this same exception still happening, even after trying on 82, 83 and 84.

So surely there is something still causing problems...

Well, I understand the problems now, and they relate to the change in how the company.default.web.id property works. Let me introduce the new rule:

The rule behind company.default.web.id is that it must be set to the web id of an existing instance. The only exception to this rule is the first startup where, of course, the new instance will be created and its web id set to the value of company.default.web.id, remembering that "liferay.com" is the default value.

Okay, so this is the rule and, if you break the rule, you'll know because you'll see the exception reported as java.lang.IllegalStateException: Unable to get default company ID.
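
To make the rule concrete, here is a minimal Java sketch that models it. This is not Liferay's actual startup code: the class DefaultCompanyRule and the method resolveDefaultWebId are hypothetical names made up for illustration, and the set of strings simply stands in for the web ids of the instances already stored in the database.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of the rule described above; NOT Liferay's real code.
final class DefaultCompanyRule {

    // Value of company.default.web.id when the property is not overridden.
    static final String DEFAULT_WEB_ID = "liferay.com";

    /**
     * Returns the web id of the default instance, or throws if the rule is broken.
     *
     * @param existingWebIds  web ids of the instances already in the database
     *                        (empty on the very first startup); may be modified
     * @param configuredWebId value of company.default.web.id, or null if unset
     */
    static String resolveDefaultWebId(Set<String> existingWebIds, String configuredWebId) {
        String webId = (configuredWebId == null) ? DEFAULT_WEB_ID : configuredWebId;

        if (existingWebIds.isEmpty()) {
            // First startup: the default instance is created with this web id.
            existingWebIds.add(webId);
            return webId;
        }

        if (existingWebIds.contains(webId)) {
            // The property names an existing instance, so it becomes the default.
            return webId;
        }

        // The property matches no existing instance: the rule is broken.
        throw new IllegalStateException("Unable to get default company ID");
    }

    public static void main(String[] args) {
        Set<String> database = new LinkedHashSet<>();

        // First startup with a custom web id: allowed, the instance is created.
        System.out.println(resolveDefaultWebId(database, "example.com"));

        // Later startup with a web id that matches no instance: throws.
        try {
            resolveDefaultWebId(database, "other.example.com");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is the ordering of the checks: an empty database is the only situation where any value is accepted, and after that the property has to name something that already exists.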

So there were two recent reports of this happening, but in each case they had basically broken the rules.

The first case was an upgrade. The upgrade to U84 completed successfully, but when the environment started it reported the exception and bailed.

This can happen in a couple of different ways:

  • The pre-upgrade environment was created without a setting for company.default.web.id (so the default liferay.com was used), but the property was added later since non-Liferay orgs don't want to use liferay.com as the web id. Since the older versions of Liferay ignored the property, the old version would start successfully, but under the new version it breaks the rule because the property doesn't match an existing company web id.

  • The pre-upgrade environment was created with the custom setting for company.default.web.id, but the upgrade tool's properties did not include the company.default.web.id property, so the upgrade updated the current Company instance to use the default liferay.com value. When the new environment started with company.default.web.id set to the expected value in portal-ext.properties, it didn't match an existing company web id.

The second case was a little more nefarious. An official Liferay Docker U84 image was started using a portal-ext.properties file which contained only a setting for company.default.web.id and nothing else. When the container started, it threw the same exception.

How did this user violate the rules?

So the Docker images contain a pre-populated HSQL database that was created using the default liferay.com web id. The DB is pre-populated to speed up the launch of the container when doing simple demos.

But, in the scenario outlined, the liferay.com web id in that database did not match the value provided in portal-ext.properties. Had the portal-ext.properties file included JDBC properties pointing to a new database, startup would have proceeded under the exception to the rule: Liferay would have created and populated the tables and used the provided company.default.web.id value when creating the instance. Alternatively, had a different data directory been configured, Liferay wouldn't have used the pre-populated HSQL database; it would have created a new database and populated it with the value from the property.
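
The Docker scenario above can be turned into a rough pre-flight check. The Java sketch below is only a heuristic under stated assumptions: the class BundledDatabaseCheck and its methods are hypothetical names, it looks only at the text of a portal-ext.properties file, and it treats a missing jdbc.default.url as "the bundled HSQL database will be used". It cannot see a remounted data directory or configuration supplied by other means, and the JDBC URL in the test data is a made-up example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical pre-flight check; names are made up for this sketch.
final class BundledDatabaseCheck {

    // Value of company.default.web.id when the property is not overridden.
    static final String DEFAULT_WEB_ID = "liferay.com";

    /**
     * Returns true when the properties set a custom company.default.web.id
     * but no jdbc.default.url, which is the combination from the Docker case.
     * Assumption: without jdbc.default.url the pre-populated database is used.
     */
    static boolean likelyToFail(Properties portalExt) {
        String webId = portalExt.getProperty("company.default.web.id", DEFAULT_WEB_ID).trim();
        boolean customWebId = !DEFAULT_WEB_ID.equals(webId);
        boolean usesDefaultDatabase = portalExt.getProperty("jdbc.default.url") == null;
        return customWebId && usesDefaultDatabase;
    }

    // Parses properties-file text; kept string-based so the sketch needs no files.
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    public static void main(String[] args) {
        Properties onlyWebId = parse("company.default.web.id=example.com\n");
        System.out.println("likely to fail: " + likelyToFail(onlyWebId));
    }
}
```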

So these two cases felt like bugs, but actually ended up just being the results of breaking the rules.

How To Avoid This Exception

So basically it comes down to following the rules...

First, set the company.default.web.id before you first launch Liferay if you really want something other than liferay.com as the web id.

Now, if you can't do that, log into Liferay and create a new Virtual Instance and assign it the web id you want. Then you can change the company.default.web.id over to the new, existing web id without any problem.

Next, if you're going to be using Docker and the HSQL database, either leave company.default.web.id alone, point the container at an external database, or mount an empty directory for the HSQL database (so it gets recreated).

And finally, and probably most important, if you are doing an upgrade, be sure to copy all of your properties from your portal-ext.properties file into the DB upgrade tool's portal-upgrade-ext.properties file. You should do this whether you have a custom value for company.default.web.id or not, but it is critical that you include this property during upgrades when you do have a custom value.
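
If you want to double-check that last step before running an upgrade, a small comparison of the two files is enough. The following Java sketch is one way to do it, not a Liferay-provided tool: the class UpgradePropertiesCheck and its methods are hypothetical names, and the properties are parsed from strings here only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper comparing portal-ext.properties with portal-upgrade-ext.properties.
final class UpgradePropertiesCheck {

    static final String WEB_ID_KEY = "company.default.web.id";
    static final String DEFAULT_WEB_ID = "liferay.com";

    // The web id a file effectively configures, falling back to the default.
    static String effectiveWebId(Properties properties) {
        return properties.getProperty(WEB_ID_KEY, DEFAULT_WEB_ID).trim();
    }

    // True when both files resolve to the same company.default.web.id.
    static boolean webIdsMatch(Properties portalExt, Properties upgradeExt) {
        return effectiveWebId(portalExt).equals(effectiveWebId(upgradeExt));
    }

    // Keys from portal-ext.properties that are missing or different in the upgrade file.
    static SortedSet<String> notCarriedOver(Properties portalExt, Properties upgradeExt) {
        SortedSet<String> keys = new TreeSet<>();
        for (String key : portalExt.stringPropertyNames()) {
            if (!portalExt.getProperty(key).equals(upgradeExt.getProperty(key))) {
                keys.add(key);
            }
        }
        return keys;
    }

    // Parses properties-file text; kept string-based so the sketch needs no files.
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    public static void main(String[] args) {
        Properties portalExt = parse("company.default.web.id=example.com\n");
        Properties upgradeExt = parse("");
        System.out.println("web ids match: " + webIdsMatch(portalExt, upgradeExt));
        System.out.println("not carried over: " + notCarriedOver(portalExt, upgradeExt));
    }
}
```

In a real environment you would load the two files from disk (for example with Files.newBufferedReader and Properties.load) and refuse to start the upgrade if webIdsMatch returns false.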