I have been recently working on the very interesting task of making the Liferay database shardable. What does this mean? There is a good introduction to shards here, but basically it is a way of scaling your database horizontally. For a given table or set of tables, you split up the data that is stored and fetched based on a given hash or something like that. Instead of storing a gigabyte of user data on one database, you spread this data across.. let's say, a dozen shards. This way you don't overload one database, have smaller queries (since each table has less data now), and have better overall throughput under load because all your IO is not going through one database server. Google, Facebook, Wikipedia -- all these guys shard their databases.
Nice concept, but how does it work in Liferay? Well, as sort of a quick initial use case, I tried to shard the system based on portal Instances. This is great if you are working in an ASP (Application Service Provider) environment where you host multiple portals on a single running version of Liferay. So, in a nutshell, what we do is we use Spring to intercept calls to the persistence tier, figure out what the companyId is (that which designates the Portal Instance), and switch out the DataSource underneath you before letting you continue. When there is a need to make calls across all databases (e.g., UpgradeProcess), we catch it in Spring and make the appropriate call multiple times across each of the shards. With that, you now have a sharded database!
In the future, we are looking to shard other things... Users, Communities, Layouts, Portlets, etc. But for now, this is our first cut.
If you would like to try it out yourself, check out the wiki article.

