RE: Entity indexation - bulk creation, performance issue

Eric COQUELIN, modified 9 Years ago. Expert Posts: 254 Join Date: 11/3/13
Dear all,

I have created a new entity:

<entity name="TaxFormField" local-service="true" remote-service="false" uuid="true">

		<!-- PK fields -->
		<column name="taxFormFieldId" type="long" primary="true"></column>

		...

		<reference package-path="com.liferay.portlet.asset" entity="AssetEntry" />

	</entity>


Then I implemented a bulk creation module that loads about 10k lines from a file, each line corresponding to one instance of the entity. Within a single transaction, I call the add method for each line:

public TaxFormField addTaxFormField(TaxFormField taxFormField, ServiceContext serviceContext) throws PortalException, SystemException {

		taxFormField.setGroupId(serviceContext.getScopeGroupId());
		taxFormField.setCompanyId(serviceContext.getCompanyId());
		taxFormField.setCreateDate(serviceContext.getCreateDate(new Date()));
		taxFormField.setStatus(WorkflowConstants.STATUS_DRAFT);

...

		taxFormField = this.addTaxFormField(taxFormField);
		return taxFormField;
	}


which in turn calls the default generated method:
taxFormField = this.addTaxFormField(taxFormField);


That method is annotated for indexing:

@Indexable(type = IndexableType.REINDEX)


As I have defined a specific indexer for this entity in my XML file, it calls reindex each time an entity is added:

<indexer-class>com.xxx.yyy.tax.indexer.TaxFormFieldIndexer</indexer-class>


As a result, loading the file takes an eternity to complete... I wish I could either:
  • bulk index once the file is loaded - that is possible, but indexing still runs for each entity being added
  • optimize entity indexation so that it uses the same transaction and commits only once


I have no idea how to achieve the second point, so I opted for the first one.

However, calling the persistence method directly, which is not annotated with @Indexable, still triggers the listeners...

taxFormFieldPersistence.update(taxFormField);


And the problem remains, as it looks like there is an indexer listener.

How can I optimize the indexation of entity while bulk loading?

Thank you in advance for your help.
David H Nebinger, modified 9 Years ago. Liferay Legend Posts: 14933 Join Date: 9/2/06
Actually I think this should all be possible...

Say you add a local boolean field, "indexing", to your entity service impl, along with the appropriate getter/setter. This becomes your flag for whether to actually index or not.

Your bulk process would disable indexing, do the bulk load as-is, then enable indexing.

I'd actually add code to the setter so that, when you enable indexing, it invokes the same reindexing as the control panel's individual indexing option (which starts the background thread to reindex all entities).

Your indexer class should have access to the service layer, so every time it is asked to index a doc, you check the enabled flag and, if indexing is not enabled, just return null (a return of null effectively means "nothing here to index").

This should allow you to delay indexing of this entity type until the bulk load is complete and reindex afterwards, but you must remember that you will be consuming a lot of time reindexing all of the entities. Rather than reindexing all entities, you might want to keep track of the added ids during the bulk load and, after reenabling indexing, manually reindex the added/updated entities individually.
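A minimal sketch of the flag idea described above, written in plain Java so it runs without Liferay. Every name in it (FakeIndex, TaxFormFieldBulkService, bulkAdd) is a hypothetical stand-in, not Liferay API; in a real portlet the flag would live in the service impl and the check in the indexer class. It follows the second variant: remember the ids added while indexing is off and index only those when the flag is turned back on.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the search index; it only records indexed ids.
class FakeIndex {
    final List<Long> indexedIds = new ArrayList<>();

    void index(long id) {
        indexedIds.add(id);
    }
}

// Hypothetical service holding the "indexing" flag (illustrative names only).
class TaxFormFieldBulkService {
    private final FakeIndex index;
    private final Set<Long> pendingIds = new LinkedHashSet<>();
    private boolean indexing = true;

    TaxFormFieldBulkService(FakeIndex index) {
        this.index = index;
    }

    boolean isIndexing() {
        return indexing;
    }

    // Re-enabling the flag indexes only the ids collected while it was off.
    void setIndexing(boolean indexing) {
        this.indexing = indexing;
        if (indexing) {
            for (long id : pendingIds) {
                index.index(id);
            }
            pendingIds.clear();
        }
    }

    // Called once per added entity: index now, or remember the id for later.
    void add(long id) {
        if (indexing) {
            index.index(id);
        } else {
            pendingIds.add(id);
        }
    }

    // Bulk load with indexing switched off; the flag is always restored.
    void bulkAdd(List<Long> ids) {
        setIndexing(false);
        try {
            for (long id : ids) {
                add(id);
            }
        } finally {
            setIndexing(true);
        }
    }
}
```

Note that a plain boolean field like this is shared by every caller of the service instance, so a real implementation would probably need to think about concurrent adds that happen while a bulk load has indexing switched off.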

That's the best I can offer; there really isn't an OOTB way to disable individual indexers that wouldn't require a lot of other invasive changes to the core.
Eric COQUELIN, modified 9 Years ago. Expert Posts: 254 Join Date: 11/3/13
Thanks for replying so quickly.

To be honest, I wished there was an OOTB solution, but it looks like we need a workaround.

Thank you again.
Jorge Díaz, modified 9 Years ago. Liferay Master Posts: 753 Join Date: 1/9/14
Hi Eric,

Eric COQUELIN:
Thanks for replying so quickly.

To be honest, I wished there was an OOTB solution, but it looks like we need a workaround.


In 7.0/DXP that functionality is implemented in LPS-56593

All indexation requests are buffered and sent together after database transaction commit
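To illustrate what "buffered and sent together after commit" means, here is a small plain-Java model. It is only a sketch of the idea, not the LPS-56593 implementation: BufferedIndexer, requestIndex, afterCommit and afterRollback are hypothetical names, and the "send" step just records the batch in a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of index-request buffering (not Liferay code).
class BufferedIndexer {
    private final List<Long> buffer = new ArrayList<>();

    // Each element is one batch that was "sent" to the search engine.
    final List<List<Long>> sentBatches = new ArrayList<>();

    // Called where the entity would otherwise be indexed immediately.
    void requestIndex(long entityId) {
        buffer.add(entityId);
    }

    // Transaction committed: send everything buffered so far as one batch.
    void afterCommit() {
        if (!buffer.isEmpty()) {
            sentBatches.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    // Transaction rolled back: nothing was persisted, so drop the requests.
    void afterRollback() {
        buffer.clear();
    }
}
```

With this shape, 10k adds inside one transaction produce a single batch at commit time, whereas 10k separate transactions would still produce 10k batches of one.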
David H Nebinger, modified 9 Years ago. Liferay Legend Posts: 14933 Join Date: 9/2/06
Jorge Díaz:
All indexation requests are buffered and sent together after database transaction commit


Except that transactions are wrapped around the addXxxx() methods so this technique will not help you in a batch upload mode.
Jorge Díaz, modified 9 Years ago. Liferay Master Posts: 753 Join Date: 1/9/14
David H Nebinger:
Jorge Díaz:
All indexation requests are buffered and sent together after database transaction commit


Except that transactions are wrapped around the addXxxx() methods so this technique will not help you in a batch upload mode.

I think you can create your own transaction using TransactionInvokerUtil.invoke, in the same way Staging/LAR does during an import task (see LayoutImportBackgroundTaskExecutor).
TransactionInvokerUtil.invoke will create a new database transaction if you are outside of any, and will execute the call method of the Callable class inside it.
In the LayoutImportBackgroundTaskExecutor example, it will execute the "call" method of the inner class LayoutImportBackgroundTaskExecutor.LayoutImportCallable.
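The behaviour described above (open a transaction only when none is active, then run the Callable inside it) can be modelled in a few lines of plain Java. This is a sketch under stated assumptions: SimpleTxInvoker and BulkLoader are hypothetical stand-ins, not the Liferay classes, and "begin"/"commit"/"rollback" are merely logged rather than sent to a database. It shows why wrapping the whole bulk load in one outer call yields a single commit even though each per-entity add is itself transactional.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical stand-in for a transaction invoker; it only logs events.
class SimpleTxInvoker {
    final List<String> events = new ArrayList<>();
    private boolean active = false;

    <T> T invoke(Callable<T> callable) throws Exception {
        if (active) {
            // Already inside a transaction: join it instead of opening another.
            return callable.call();
        }
        active = true;
        events.add("begin");
        try {
            T result = callable.call();
            events.add("commit");
            return result;
        } catch (Exception e) {
            events.add("rollback");
            throw e;
        } finally {
            active = false;
        }
    }
}

// Hypothetical bulk loader built on the invoker above.
class BulkLoader {
    // Stand-in for a per-entity add that is transactional on its own.
    static void addOne(SimpleTxInvoker tx, String line, List<String> store)
            throws Exception {
        tx.invoke(() -> store.add(line));
    }

    // The outer invoke opens the only transaction; every addOne joins it.
    static int load(SimpleTxInvoker tx, List<String> lines, List<String> store)
            throws Exception {
        return tx.invoke(() -> {
            for (String line : lines) {
                addOne(tx, line, store);
            }
            return lines.size();
        });
    }
}
```

Calling BulkLoader.load with three lines leaves exactly one "begin" and one "commit" in the event log; calling addOne three times without the outer invoke would log three of each.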

You can also create a "dummy" Service Builder project in order to create a *ServiceImpl class with a transactional method and put all the logic there. But I think it is better to follow the Staging/LAR approach and call TransactionInvokerUtil.invoke.