Goodbye MySQL, Hello PostgreSQL!
Liferay is Moving from MySQL to PostgreSQL
Goodbye MySQL? No, don't worry, Liferay isn't dropping support for MySQL, don't start panicking just yet.
Liferay has been transitioning new PaaS and new/existing SaaS systems from MySQL to PostgreSQL 16, so the cloud-based systems are saying goodbye to MySQL and saying hello to PostgreSQL...
This blog post isn't really about announcing the change, I believe that has already been done in other channels.
This blog post is all about the why it is being done.
Because the why is important, right? I mean, if you're about to stand up an on-prem Liferay environment or your own hosted cloud environment, one of the choices you're going to have to make is what database should you host Liferay on.
And if you're already running Liferay on one database (such as MySQL), you might not have given any thought to moving to another database and whether there might be advantages in undertaking that effort...
So to me, the why behind the move is interesting and definitely blog-worthy.
The why boils down to one word: Performance.
Well, actually there's a second word: Scalability.
Oh, and a third phrase: Resource (CPU & Memory) Management.
Those big words hide a lot of the details, so let's start peeling away the layers of this onion and see what we find.
To evaluate MySQL vs PostgreSQL, Liferay had to create a load test to get some numbers.
The methodology of the tests is similar with what Liferay does for the performance whitepapers. A throughput-oriented testing tool was used to emulate tens of thousands of virtual online users. These online users are emulated to have "sleep" time to reflect the real-world users reading the articles/posts on the page, typing in the comment or new-post text box, during which no request will be made. Since the sleep time is fixed between each operation, as defined by the test case, the more online users can exist, the more throughput the portal can hold. The number of the online users are used as the result number.
In the tests, "SampleSQLBuilder" was used to pre-generate SQL files to be inserted into the database. This is because it was too slow to generate e.g. 100k users or 10k MB posts by calling portal's API...
In the first round of the comparison tests, Liferay found that
Regarding the bad result with MySQL, Liferay noticed that the generated test data using SampleSQLBuilder had several issues:
These issues effectively confused MySQL's query optimizer, as the indexes on the user or date columns can't reduce any data, causing MySQL to create a really bad query plan involving irrelevant indexes and back-to-table-queries. On the other hand, PSQL just uses in-memory full-table scanning, which was much faster and lighter.
To resolve this, Liferay made a portal side optimization about how ServiceBuilder generates indexes for finders and also modified SampleSQLBuilder to fix the 2 issues. With all the optimizations, in the Message Boards test case,
In general, PSQL did show better CPU consumption in the tests compared to MySQL.
It's important to note that in the database setups, the MySQL server's configuration was fine-tuned, while PSQL was using the out-of-box defaults. This implies that if the team had spent some time on the PSQL tuning, the results could have been [perhaps significantly] improved.
To go back to the Why though, the test showed in general that:
So these improvements, when multiplied by the number of environments Liferay is managing in PaaS and SaaS, translate into significant cost savings for Liferay while at the same time slightly improving performance for individual environments.
So, after this testing and the experience from monitoring all of the Liferay SaaS environments (which were exclusively MySQL), Liferay decided to make its default database in PaaS and SaaS to be PostgreSQL.
Had they asked me originally, I would have pointed them towards PostgreSQL over MySQL. I've always found PostgreSQL to be better for Liferay.
And if you research PostgreSQL vs MySQL performance and have some understanding about how Liferay data is structured (I'm not guiding you to tear into the Liferay data model here), it kind of makes sense.
Pretty much every online review will make a general recommendation like the one I found here:
When you combine this with a bit of understanding about the Liferay data model, how everything is an asset, all of the cross-table linking not maintained by managed foreign keys, etc., then you would start to classify Liferay as having complex queries and in many organizations there are large datasets (Liferay in fact is designed to support large dataset situations).
So yeah, when they found that PostgreSQL was better for Liferay than MySQL, I wasn't surprised.
The only question left to answer is "What should you do?"
If you're already on PostgreSQL, great choice, you don't have to do anything.
If you don't have a Liferay environment already and you're facing the choice, choose PostgreSQL if you can. By that I mean if it is truly a choice, i.e. you have experience with both database systems, you have resources that can install, admin and manage both systems, then choose PostgreSQL.
If you're on MySQL, though, do you stay there or do you change to PostgreSQL?
To me, it always comes down to what do you know, or what do you have resources for...
If you're a MySQL shop, you have no PostgreSQL installations, you've never used PostgreSQL, you don't have PostgreSQL administrators, etc., the pain you'll face trying to use PostgreSQL for Liferay just to get a bit of a performance gain may not be worth it.
If it truly is a choice, meaning you have experience or administrators for both platforms, the next question is how the ROI measures up. Migrating from MySQL to PostgreSQL is not going to be free, there's definitely going to be a cost there in time, money and resources. The return though is going to be on the performance impact to the site which is going to be rather arbitrary. Only you can know how your site would perform in PostgreSQL vs MySQL, only you can determine if there is a benefit in changing, and only you can evaluate what the performance gain might be worth. So only you can determine if there would be an ROI on the change and whether that is going to be greater than the cost to switch.
So generally, I wouldn't think any existing site should really consider switching from MySQL to PostgreSQL. While I can't know your context, my own impression is that the ROI is generally just not going to be there.
For Liferay though, for hosting as many instances as they do in SaaS and PaaS, the ROI is completely clear and significantly covers the costs they'll face in moving environments from MySQL to PostgreSQL, so it is easy to justify.
I just think that few other implementations are going to be able to demonstrate that as clearly.
As a last word, though, I'll just go back to what I've already said - if you have a choice, go with PostgreSQL.