Goodbye MySQL, Hello PostgreSQL, Part 2

Liferay has delivered on the promise to release tooling to help move from MySQL to PostgreSQL. Read on to learn how you too can use the tool to migrate your own environment.

Introduction

In my first post, Goodbye MySQL, Hello PostgreSQL, I shared that after a lot of practical research, Liferay had decided to move their cloud platforms off of MySQL and onto PostgreSQL for performance reasons.

I also indicated that Liferay was working on tooling to support this migration, but at the time it wasn't available, so that was all I could say about it.

Well, as of Liferay DXP 2024.Q4 and Liferay Portal CE GA129, there's a new Control Panel option and a tool to help you complete this migration too. This blog post introduces the tool and explains how to use it...

Using the Tooling

So using the new tool is actually a two-step process...

Step 1 - Export the Schema

So this step is completed in the System Settings Control Panel. Under the Platform section, look for Upgrades:

Within the control panel, you may want to check out the other available options, but we're going to be focusing on just the Database Schema Definition Exporter:

From the Database Type dropdown, select Postgresql.

Note: You may be tempted to try the Mysql option, but don't. The implementation is not complete, so you can't convert to Mysql even though the option is there. I think it's just there for future use.

Enter the path where the schema files should be generated, then click on the Save button to save the schema files.

When you do this, you'll get a little green toast in the lower-left corner indicating everything was successful, plus you'll get a couple of log messages:

2025-02-27 20:28:04.377 INFO  [CM Event Dispatcher (Fire ConfigurationEvent: pid=
  com.liferay.portal.db.schema.definition.internal.configuration.
  DBSchemaDefinitionExporterConfiguration)][DBSchemaDefinitionExporter:103] 
  Start database schema definition export
2025-02-27 20:28:04.635 INFO  [CM Event Dispatcher (Fire ConfigurationEvent: pid=
  com.liferay.portal.db.schema.definition.internal.configuration.
  DBSchemaDefinitionExporterConfiguration)][DBSchemaDefinitionExporter:125] 
  Finished database schema definition export to 
  /users/dnebinger/liferay/clients/dnebing/blogs

When you check this directory, you'll find three files there:

  • tables.sql - This file has the table creation script.
  • indexes.sql - This file has the index creation script.
  • db_schema_definition_export_report.txt - This is the report generated during the export.

You're going to want to review the export report file and make sure there are no errors. Any errors indicated in this file are precursors to migration problems that could arise later.
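
If you want a quick way to scan it, something like this from the export directory works (a simple sketch; I'm assuming the report flags problems with words like "error" or "fail", so adjust the pattern to whatever wording you actually see in your report):

grep -iE 'error|fail' db_schema_definition_export_report.txt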

You can check the tables.sql and indexes.sql files if you'd like, but this is typically not necessary. Just know that the files contain the correctly named tables, columns, and indexes, with the right datatypes, so that when the data import runs later, the resulting PostgreSQL database will be as ready as it can be for your Liferay environment.

Before moving onto Step 2, you need to shut down the Liferay node(s) that are connected to the database. The data migration may not be successful or accurate if there is a running Liferay node that is performing data updates while the migration is executing.

Step 2 - Run the DB Schema Importer Script

Now that the Liferay node(s) are stopped and the schema has been exported, we're ready to move on to the next step: executing the DB Schema Importer script.

In the tools directory under $LIFERAY_HOME, you'll find the new tool in the portal-tools-db-schema-importer directory.

This directory contains a jar file (with the implementation code) and the db_schema_importer.sh execution script.

Sorry, my dear Windows brethren, but as of yet there is no db_schema_importer.bat script. I've already asked the team where it is, so it may turn up in a future release. As I see it, for the time being you have three choices: 1) upgrade to Linux ;-), 2) try the Windows Subsystem for Linux (WSL), although I haven't tried it and don't know if it works, or 3) just look at the shell script and execute the java command manually (it's pretty easy to do; the script is not at all complicated).

If you run the script with no arguments, you will see the usage details:

$ ./db_schema_importer.sh
Missing required options: path, source-jdbc-url, source-password, source-user, 
  target-jdbc-url, target-password, target-user
usage: Liferay Portal Tools Database Schema Importer
    --help                    Print help message.
    --jdbc-batch-size <arg>   Set the JDBC batch size. The default value
                              is 2500.
    --jdbc-fetch-size <arg>   Set the JDBC result set fetch size. The
                              default value is 2500.
    --path <arg>              Set the path of the source SQL files.
    --source-jdbc-url <arg>   Set the source JDBC URL.
    --source-password <arg>   Set the source database user password.
    --source-user <arg>       Set the source database user.
    --target-jdbc-url <arg>   Set the target JDBC URL.
    --target-password <arg>   Set the target database user password.
    --target-user <arg>       Set the target database user.

So the arguments should be fairly easy to understand, but I do want to call some out...

The JDBC batch and fetch sizes determine the number of records to write (in a single transaction) and to read at a time. Changing these sizes will affect the runtime, but may impose additional memory requirements on the java process as well as on the source or target databases.
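
If you do want to experiment, the two flags just get added alongside the required arguments; for example, doubling both from their defaults (an illustration only, not a recommendation; the "..." stands for the path, source, and target arguments covered below):

$ ./db_schema_importer.sh --jdbc-batch-size 5000 --jdbc-fetch-size 5000 ...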

The path argument is the same value that you used in the Control Panel; it points to where the SQL files are located. From my example above, I'd be using the value /users/dnebinger/liferay/clients/dnebing/blogs for my run.

The Source parameters are for your Mysql database. Now you might look at this and say, "Hey, maybe I could use HSQL or SQL Server or [enter other supported DB here] instead of Mysql..." and all I can say is: maybe. I haven't tried another database, and I haven't seen the code to know whether it is really dependent on Mysql. Maybe it works, maybe it doesn't; if you do get it working on a different database, leave a comment below. My own guess is that, since there is no parameter for the source or target driver classes, there might not be as much flexibility here as we might hope.

Since we're moving from MySQL and Liferay is already using it, you can typically find your MySQL details in either portal-ext.properties or in the JNDI configuration for your app server.
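
For reference, the MySQL flavor of those properties typically looks something like this (the host, database name, and credentials here are just the example values used later in this post; older installs may use the legacy com.mysql.jdbc.Driver class and extra URL parameters):

jdbc.default.driverClassName=com.mysql.cj.jdbc.Driver
jdbc.default.url=jdbc:mysql://mysqlserver:3306/lportal
jdbc.default.username=mysqlroot
jdbc.default.password=mysqlpswd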

The Target parameters are for your PostgreSQL database. Note that your database should exist before you try to import the schema. As a refresher, to create a database named lportal for Liferay, you'd execute the following commands at the psql prompt:

create role liferay with login password 'lportal';
alter role liferay createdb;
alter role liferay superuser;
create database lportal with owner 'liferay' encoding 'UTF8' 
  LC_COLLATE='en_US.UTF-8' LC_CTYPE='en_US.UTF-8' template template0;
grant all privileges on database lportal to liferay;

So here we've created a new user, liferay, with the password lportal, created the lportal database with UTF-8 encoding, and granted the liferay user the ability to manage the lportal database.

Please please please do not use these credentials for your own database. They are simply here to make it easy to blog about, but you should use unique values that are appropriately secured per your organizational policies.

Using these details, you have your target username liferay and password lportal, and the JDBC URL would be jdbc:postgresql://dbserver:5432/lportal.
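
Before running the importer, it doesn't hurt to confirm the target accepts connections with those credentials. Using the example values above (with the hypothetical host dbserver):

$ psql -h dbserver -U liferay -d lportal -c "select 1;"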

Okay, so we've reviewed the parameters and have the target PostgreSQL database ready, so we can go ahead and run the tool:

$ ./db_schema_importer.sh --path /users/dnebinger/liferay/clients/dnebing/blogs 
  --source-jdbc-url jdbc:mysql://mysqlserver:3306/lportal --source-user mysqlroot 
  --source-password mysqlpswd 
  --target-jdbc-url jdbc:postgresql://dbserver:5432/lportal --target-user liferay 
  --target-password lportal

How long the process takes to run will vary.

Factors that could affect the runtime:

  • Size of the database.
  • Changed values for the batch and fetch size arguments.
  • The source database, MySQL, is only going to be read from. This means you could put the MySQL server into a read-only configuration to improve read performance.
  • The target database, PostgreSQL, is only going to be written to; there are no reads. This means you could tweak the configuration so PostgreSQL optimizes write performance.

If you do apply the last two options, be sure to undo the configuration changes after the migration is complete.

When the script is done, there will be a new file in the directory named db_schema_import_report.info. You're going to want to review this file, as it is a report of the migration outcome. Obviously you are looking for a successful outcome; if you have a failure, you'll want to drop the PostgreSQL database, recreate it, and then fix the source of the errors (whether data, code, or whatever) before trying the migration again.
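
If you do need to start over, dropping and recreating the target is quick; reusing the example names from earlier, at the psql prompt:

drop database lportal;
create database lportal with owner 'liferay' encoding 'UTF8' 
  LC_COLLATE='en_US.UTF-8' LC_CTYPE='en_US.UTF-8' template template0;
grant all privileges on database lportal to liferay;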

Post Migration Activities

Yeah, that's right, the migration is done; there are just a few things left to do.

First, of course, you have to configure the Liferay node(s) to point at the new PostgreSQL database. You'll do this either in your portal-ext.properties file or in the JNDI configuration for your application server.

For the properties option, your new properties would be:

jdbc.default.driverClassName=org.postgresql.Driver
jdbc.default.url=jdbc:postgresql://dbserver:5432/lportal
jdbc.default.username=liferay
jdbc.default.password=lportal
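
If you're using JNDI instead, the same change happens in your app server's data source definition; as a sketch for Tomcat (the resource name jdbc/LiferayPool and the minimal attribute set here are illustrative, so keep whatever pool settings you already had for MySQL):

<Resource name="jdbc/LiferayPool" auth="Container" type="javax.sql.DataSource"
  driverClassName="org.postgresql.Driver"
  url="jdbc:postgresql://dbserver:5432/lportal"
  username="liferay"
  password="lportal" />

with the matching property pointing Liferay at it:

jdbc.default.jndi.name=jdbc/LiferayPool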

Make sure that all of the nodes are updated to point at PostgreSQL; you don't want some talking to PostgreSQL and others talking to the old MySQL database, that won't end well.

Reindexing will not be necessary when you start up the nodes. After all, the database should be a perfect clone of the old one; it's just running on a different platform.

Start your first Liferay node up and make sure that it connects to the database cleanly and you get the expected startup messages (i.e. if you were getting stack traces when starting against MySQL, the same ones should happen with PostgreSQL since it doesn't magically fix your startup).

Connect to the instance and verify your site(s) are operating the same as they were before the migration.

Start the remainder of your nodes; you're now back in business.

Best Practices for Migration

So you can use what I've shared so far and probably be pretty successful, but I wanted to share some best practices to help improve the whole process...

First, the migration time is going to be determined largely by the size of the existing MySQL database. Any steps you can take to reduce the size of the database will improve the migration time. This could include deleting things you are no longer using, cleaning up old document versions, etc.
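
If you want to see where the bulk of the data lives before you start cleaning up, a quick look at table sizes on the MySQL side can help (a standard information_schema query; swap in your own schema name for lportal):

select table_name,
    round((data_length + index_length) / 1024 / 1024) as size_mb
    from information_schema.tables
    where table_schema = 'lportal'
    order by size_mb desc limit 20;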

Second, practice makes perfect. Start by doing the migration on your lower lanes (development or test servers), and "promote" the migration to the higher lanes with each success. By the time you're doing the production instance, you will know what to expect and there will be no surprises. Additionally, you'll be able to gather migration time estimates, so you'll know how long production will take to process.

Third, process this like a project. Have a plan. Have a project manager. Track the progress as you go through the different environments.

Fourth, take a PostgreSQL backup post migration but before Liferay launch. Backups represent a fallback restore point should something go awry.

Fifth, stop your MySQL database before launching the Liferay node(s). This will help ensure you don't end up with a node that was missed during the update and is still trying to use the MySQL database. Note I'm just saying stop it, not drop it or anything. Keep it around (again, as a fallback point); just ensure the Liferay node(s) cannot possibly connect to and use it.
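
How you stop it depends on how MySQL is hosted, but on a typical Linux server it's just the service (the unit name might be mysql, mysqld, or mariadb depending on your distro):

$ sudo systemctl stop mysql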

Sixth, start with the default batch and fetch sizes, then try different sizes to identify what works best for your databases and data. Time reduction is the goal, so find the sizes that lead to the best results.

Seventh, if you find the runtimes excessive in the lower lanes due to the volume of data you have, you might test configuring MySQL to be in read-only mode and/or PostgreSQL to optimize writes while the migration process is running. I wouldn't recommend this if your migration times are reasonable, since you have to remember to restore the original configuration after the migration has completed. For excessive runtimes, though, changing the configuration this way could make a significant difference.

Optimizing PostgreSQL for writes would use a configuration like the following:

max_connections = 100
shared_buffers = 1024MB
synchronous_commit = off
wal_writer_delay = 1000ms               
wal_writer_flush_after = 10MB 
max_wal_size = 1GB
min_wal_size = 80MB
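
On the MySQL side, read-only mode is just a pair of global variables you can flip for the duration of the migration (run as an admin user from the mysql client; super_read_only requires MySQL 5.7 or later, and remember to turn both back off afterward):

set global read_only = ON;
set global super_read_only = ON;

-- and once the migration completes:
set global super_read_only = OFF;
set global read_only = OFF;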

Finally, remember that the foundation of Liferay is its database. If at any point during the migration you do not have complete success, don't ignore the errors. They could represent future stability or data problems for the platform.

Conclusion

So we've reviewed the Liferay tool for migrating from MySQL to PostgreSQL, the steps involved, and the overall process.

Some might ask, "Why do it this way?" And I get it. MySQL Workbench and other tools are capable of migrating databases from one platform to another with simple user interfaces. So why use the Liferay tool which is clearly more involved?

For me, it comes down to Liferay expectations. For example, Liferay might expect a certain column in a table to be a CHAR(75) column that supports UTF-8, but a generic migration tool might not make the right datatype choice when creating the PostgreSQL version. Since Liferay is generating the tables and indexes scripts, it is going to create the exact types, names, sizes, etc. of everything in the PostgreSQL database.

And with my comment above about the database being the foundation of Liferay, letting Liferay control all of that creation in PostgreSQL will ensure that the foundation remains solid.

Plus, since this is a tool provided by Liferay, all of you DXP clients get support for the migration, which is not something you can say about any of the other tools.

So yeah, that's all I've got.

Hopefully if you've been considering migrating from MySQL to PostgreSQL, you can use this blog to empower yourself to get it done.

If you have any questions or problems, hit me up in the comments below, on the Liferay Community Slack, or drop in on one of my Office Hours sessions.