Two approaches with the same goal
Hi Liferay Community,
Before answering this question, I would like to explain what sharding is: to overcome the horizontal scalability concerns of open source databases at the time (circa 2008), Liferay implemented physical partitioning support. The solution allowed administrators to configure portal instances to be stored in different database instances and database server processes.
This feature was originally named "sharding", although "data partitioning" is more accurate, since it requires a small amount of information sharing to coordinate partitions.
Beginning in 7.0, Liferay removed its own physical partitioning implementation in favor of the capabilities provided natively by database vendors. Please note that logical partitioning via the "portal instance" concept (a logical set of data grouped by the companyId column, with data security enforced at the portal level) is not affected by this change and is still available in current Liferay versions.
Having explained this, the answer to the question is simple: just follow the official procedure:
https://dev.liferay.com/discover/deployment/-/knowledge_base/7-0/upgrading-sharded-environment
So Liferay 7.x provides a process which converts all shards into independent database schemas after the upgrade. This can be suitable for those cases where you need to keep information separated for legal reasons. However, if you cannot afford to maintain one complete environment for each of those independent databases, you could try another approach: disable sharding by merging all shards into a single database schema before performing the upgrade to Liferay 7.x.
Merging all shard schemas into the default one is feasible because sharding generates row ids that are unique across all databases. These are the steps you should follow to achieve this:
1. Create a backup of the shard database schemas in the production environment.
2. Copy the content of every table in the non-default shards into the default shard. It's recommended to create an SQL script to automate this process.
3. If a unique index is violated, analyze the two records that cause the conflict and remove one of them, since it is no longer necessary (data may have been created in the wrong shard in the past for different reasons, such as a wrong configuration, a bug, or issues with custom developments).
4. Resume the process from the last point of failure.
5. Repeat steps 3 and 4 until the default shard database contains all data from the other shards.
6. Clean up the Shard table, keeping only the default shard record.
7. Start up a Liferay server using this database without the sharding settings in portal.properties:
   - Remove all database connections except for the default one.
   - Comment out the line META-INF/shard-data-source-spring.xml in the spring.configs property.
8. Ensure that everything works well and that you can access the different portal instances.
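The copy-and-resolve part of this procedure can be sketched as follows. This is only a minimal illustration in Python, using in-memory SQLite databases as a stand-in for the real shard schemas; the `merge_shard` helper and the simplified `User_` table are assumptions made for the example, not part of Liferay. Note one deliberate difference from the steps above: instead of stopping at the first unique index violation and resuming afterwards, the sketch skips each conflicting row and records it, so the conflicts can be reviewed (and kept as a record) once the copy has finished.

```python
import sqlite3


def merge_shard(default_db, shard_db, tables):
    """Copy every row of each listed table from shard_db into default_db.

    Rows that violate a unique index in the default shard are not copied;
    they are returned as (table, row) pairs for manual analysis.
    """
    conflicts = []
    for table in tables:
        rows = shard_db.execute(f"SELECT * FROM {table}").fetchall()
        if not rows:
            continue
        placeholders = ", ".join(["?"] * len(rows[0]))
        for row in rows:
            try:
                default_db.execute(
                    f"INSERT INTO {table} VALUES ({placeholders})", row
                )
            except sqlite3.IntegrityError:
                # Unique index violated: keep a record instead of aborting.
                conflicts.append((table, row))
    default_db.commit()
    return conflicts


def demo():
    # Hypothetical, heavily simplified schema used only for this example.
    schema = (
        "CREATE TABLE User_ ("
        "userId INTEGER PRIMARY KEY, companyId INTEGER, screenName TEXT, "
        "UNIQUE (companyId, screenName))"
    )
    default_db = sqlite3.connect(":memory:")
    shard_db = sqlite3.connect(":memory:")
    for db in (default_db, shard_db):
        db.execute(schema)

    default_db.execute("INSERT INTO User_ VALUES (1, 10, 'admin')")
    shard_db.executemany(
        "INSERT INTO User_ VALUES (?, ?, ?)",
        [
            (2, 20, "admin"),  # belongs to another portal instance: copied
            (3, 10, "admin"),  # duplicates (companyId, screenName): conflict
        ],
    )

    conflicts = merge_shard(default_db, shard_db, ["User_"])
    total = default_db.execute("SELECT COUNT(*) FROM User_").fetchone()[0]
    return conflicts, total


if __name__ == "__main__":
    print(demo())
```

In a real migration the same idea would run against the actual database vendor and the full list of Liferay tables, and each recorded conflict would still need the manual analysis described in the steps above before one of the two records is discarded.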
It is recommended that you keep a record of the changes made in steps 3 and 6, since you will need to repeat this process when you decide to go live after merging all databases into the default shard. It is also advisable to do this as a separate project before performing the upgrade to Liferay 7.x. Once you have completed this process, you just need to execute the upgrade as for a regular non-sharded environment:
https://dev.liferay.com/en/discover/deployment/-/knowledge_base/7-1/upgrading-to-liferay-71
This alternative way of upgrading sharded environments is not officially supported, but it has been executed successfully in a couple of installations. If you have any questions about it, please write a comment on this blog entry or open a new thread in the community forums. Other members of the community and I will try to assist you during this process.

