RE: Storing large data in SingleVMpool Cache

Anji E, modified 6 Years ago. Junior Member Posts: 49 Join Date: 11/18/14 Recent Posts

We have a requirement to call a third-party service that returns a large amount of customer data, which we need to process and display in portlets.

We need to call this service in many portlets. In order to avoid multiple calls we are planning to store the response in singleVMpool cache.

We receive JSON data from the third-party API, which we parse into a list, and that list will be placed into the singleVMpool cache.

 

Below are our concerns.

1. Is there any limit on how much data can be stored in the singleVMpool cache?

2. Our production environment is clustered, so each node has to maintain the data in its own VM. Is that correct?

3. Will this cause any memory pressure, data loss, or performance issues in the long run?

 

Thanks in advance.

 

David H Nebinger, modified 6 Years ago. Liferay Legend Posts: 14933 Join Date: 9/2/06 Recent Posts

Single VM pool is like a big cache.  It can hold the data you pull, but as you pointed out it will consume resources while it is held in memory. And, in your cluster, each node will manage the single vm pool and each will be responsible for managing their own pool separately.

Multi VM pool is the clustered variety; basically one node does the managing but all changes are broadcast across the cluster.

Before you go down this road, however, you should consider alternatives.

First, actually measure the time it takes to complete the API call; if it is not time consuming, you might be making your life harder trying to cache and maintain the data.
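One rough way to take that measurement is to time a handful of calls with System.nanoTime and look at the median rather than a single run. This is only a sketch: ApiCallTimer and the Supplier standing in for the third-party call are hypothetical names, not Liferay API.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical helper: times repeated invocations of a call and reports the median.
class ApiCallTimer {

    // Runs the call `runs` times and returns the median elapsed time in milliseconds
    // (the upper median when `runs` is even).
    static double medianMillis(Supplier<?> call, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            call.get();                       // the remote call being measured
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos[runs / 2] / 1_000_000.0;
    }
}
```

Wrapping the real service call in the Supplier and running it a few times from a test or a scratch portlet action should be enough to tell whether caching is worth the extra complexity.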

Second, determine how much of the data is actively used. For example, if you are pulling 10k records but then the portlets are only actively displaying a few records at a time, it may actually be better to build a table to store the data in the database. Use a scheduled job to pull and update the database, then let the portlets use the local service to grab the data you need.

The benefits of the last option are: a) it is cluster friendly without actively consuming memory, b) records pulled from the database by the portlets are auto-cached by ehcache (so you get the performance improvement without building anything manually), and c) from a portlet perspective the portlets will be coded more or less like any other standard Liferay implementation, so you don't need any special knowledge to implement it.

Anji E, modified 6 Years ago. Junior Member Posts: 49 Join Date: 11/18/14 Recent Posts

Hello David ,

Thanks a lot for the reply and the valid suggestion.

 

Initially we planned the same approach that you have suggested, i.e. storing all data from the third-party API in a Liferay table and retrieving the data using Service Builder in many portlets.

We also wrote a scheduler to sync the data once or twice a day.

But the data from the third-party API changes frequently (e.g. hourly), and the changes have to be reflected immediately because the data contains price details etc.

 

Hence we have decided to implement the cache mechanism.
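Because the data changes roughly hourly, any cache holding it needs an expiry so that stale prices are not served. The idea can be sketched in plain Java, independent of Liferay's cache API. ExpiringValue, the loader, and the injected clock are hypothetical names used only for illustration; in a real portlet the value would live in the portal cache rather than in a field.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical read-through holder: reloads the value once it is older than ttlMillis.
class ExpiringValue<T> {
    private final Supplier<T> loader;   // e.g. the third-party API call
    private final long ttlMillis;       // how long a loaded value stays valid
    private final LongSupplier clock;   // injected so expiry can be tested
    private T value;
    private long loadedAt;
    private boolean loaded;

    ExpiringValue(Supplier<T> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached value, calling the loader only on first use or after expiry.
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= ttlMillis) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

In production code the clock would typically be System::currentTimeMillis and the TTL would match how often the third-party data is expected to change.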

 

Problem with SingleVMCache.

We have 2 portlets in our project, Portlet-A and Portlet-B.

1. Portlet-A invokes the third-party API and puts all the data into the cache. --> Working.

2. Portlet-A tries to access the data from the cache. --> Working.

3. Portlet-B tries to access the data from the cache. --> Not working; it fails with java.lang.ClassCastException.

 

i.e. data put into the cache by one portlet cannot be read by another portlet.

Is there any way to make the cache available across portlets?

 

Note: I have written the caching code in the portlet, not in Service Builder. Is this the reason why I am not able to share the cached data between the portlets?

 

David H Nebinger, modified 6 Years ago. Liferay Legend Posts: 14933 Join Date: 9/2/06 Recent Posts

Updating the database hourly should not be an issue; when you push data into the SB table, the node will broadcast the update across the cluster so the other nodes with cached data will purge the stale data from their caches. Unless you are processing a ton of data, the update itself should complete in a matter of seconds (single-digit seconds). The only way you'll know is to time the process to determine just what the "latency" might be.

As to the class loader issue, try to use class(es) from outside portlet A and B that are pulled in from, say, a shared module. The CCEs normally come when you are using a class that may be separately loaded from two different class loaders.

 

Anji E, modified 6 Years ago. Junior Member Posts: 49 Join Date: 11/18/14 Recent Posts

Hello David,

Thanks a lot for your reply.

As per your suggestion, I will check the "latency" and the minimum interval at which the data changes. Based on these criteria I will sync the data into the Liferay table.

Thanks again for your time.

Olaf Kock, modified 6 Years ago. Liferay Legend Posts: 6441 Join Date: 9/23/08 Recent Posts

On top of what David said: The solution to your ClassCastException will depend on the version of Liferay that you're using, and how you deploy your portlets. If you're on a pre-OSGi version and deploy your portlets in different WARs, then they'll be loaded in different web applications, each with its own class loader, and I almost expect you to run into those issues. This is where the common classloader comes in (e.g. global classpath, yuck).

In the OSGi world (7.x) you'll need proper dependency handling, but I'd expect compile-time rather than runtime errors if you mess this up. Thus, my assumption is that you're on 6.x (?)

Another workaround is to limit the cached data to standard data types, e.g. List, Map, String, etc. - not necessarily pretty, but neither is the use of the global classpath for the classes of objects that you just want to cache.
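The standard-data-types workaround can be sketched roughly as follows. The CustomerData fields used here (id, name, price) are assumptions for illustration; the thread never shows the real class. Each portlet keeps its own copy of the class, and only JDK types cross the cache boundary, so nothing loaded by a web application class loader has to be cast on the other side.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the custom class; the real fields are not shown in the thread.
class CustomerData {
    final int id;
    final String name;
    final double price;

    CustomerData(int id, String name, double price) {
        this.id = id;
        this.name = name;
        this.price = price;
    }

    // Flattens the object into JDK-only types that are safe to put in a shared cache.
    Map<String, Object> toMap() {
        Map<String, Object> m = new HashMap<>();
        m.put("id", id);
        m.put("name", name);
        m.put("price", price);
        return m;
    }

    // Rebuilds the object from the flattened form read back from the cache.
    static CustomerData fromMap(Map<String, Object> m) {
        return new CustomerData(
            (Integer) m.get("id"),
            (String) m.get("name"),
            (Double) m.get("price"));
    }
}
```

The writing portlet would cache the result of toMap() (or a List of such maps), and the reading portlet would call its own fromMap() on what it gets back.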

Anji E, modified 6 Years ago. Junior Member Posts: 49 Join Date: 11/18/14 Recent Posts
Hello Olaf,

Thanks a lot for your swift reply.

I am on Liferay 6.2.

I was putting custom class objects in the cache, which is why it was not working.

I tried caching a Map<Integer, String> and it works as expected. But we need to cache a Map<Integer, List<CustomerData>>.

David H Nebinger, modified 6 Years ago. Liferay Legend Posts: 14933 Join Date: 9/2/06 Recent Posts
Anji E:

I tried cache Map<Integer,String> it is working as expected. But we need to cache map<Integer, List<CustomerData>>

If you create a fake entity, you can use a simple service builder (blank) portlet to declare your CustomerData entity. Stick with the SB methods to create instances, then you can store those instances inside of the SingleVMPool.
 
Anji E, modified 6 Years ago. Junior Member Posts: 49 Join Date: 11/18/14 Recent Posts
I found a workaround to store the objects in the cache.

I converted the objects into a JSON string using Jackson's ObjectMapper (com.fasterxml.jackson.databind.ObjectMapper).
Now I am able to retrieve the JSON string in any portlet and then convert it back into objects using ObjectMapper.
Olaf Kock, modified 6 Years ago. Liferay Legend Posts: 6441 Join Date: 9/23/08 Recent Posts
Given that this thread started with the intent of saving resources of repeatedly asking for the same large amount of data over and over again - for performance reasons: Have you measured the impact of repeated JSON parsing / deserializing? How does it compare to getting the data over and over again?

No answer necessary, just food for thought. Make sure that you actually optimized enough for the added complexity.