Making Private Pages Searchable

Overview

While working on an intranet with one of our clients in the APAC region, we discovered that private pages are not indexed correctly, so their content does not show up reliably in search results. The product team is already aware of this and has logged a ticket here. A proper fix in Liferay itself is the right long-term answer, but we did not have time to wait for it, so we created this workaround. I'd still recommend switching to the product team's solution once it becomes available.

Why aren't private pages searchable?

Private pages are not indexed properly because the layout crawler responsible for them requests each page with Guest user access, and the Guest user is not allowed to view private pages.

Workaround Details

The workaround is a custom module that you deploy to your Liferay workspace. Below are the steps we followed during development.

  1. Create a custom page crawler that specifically targets private pages. The custom page crawler leverages the out-of-the-box (OOTB) auto-login of the Remember Me feature to authenticate and access private pages.
  2. Attach a dummy user to the page crawler. The dummy user should be a member of the site where the private pages are stored.
  3. Call the custom page crawler after indexing via a custom implementation of the IndexerPostProcessor interface. You can read more about this here.
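
To illustrate steps 1 and 2, here is a minimal sketch of how a crawler might build an authenticated request for a private page. It is not the module's actual code. It assumes that the Remember Me auto-login reads cookies named ID, PASSWORD, and REMEMBER_ME, which you should verify against your Liferay version. The class name, method names, and parameter values are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: builds an HTTP request that carries Remember Me cookies
// for the dummy user, so the portal can auto-login before rendering the page.
class PrivatePageRequestSketch {

    // Assumed cookie names; confirm them for the Liferay version you run.
    static final String COOKIE_ID = "ID";
    static final String COOKIE_PASSWORD = "PASSWORD";
    static final String COOKIE_REMEMBER_ME = "REMEMBER_ME";

    // Joins the cookies into a single "Cookie" header value.
    static String cookieHeader(String hashedUserId, String hashedPassword) {
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put(COOKIE_ID, hashedUserId);
        cookies.put(COOKIE_PASSWORD, hashedPassword);
        cookies.put(COOKIE_REMEMBER_ME, "true");

        return cookies.entrySet().stream()
            .map(entry -> entry.getKey() + "=" + entry.getValue())
            .collect(Collectors.joining("; "));
    }

    // Builds (but does not send) a GET request for the private page URL.
    static HttpRequest buildRequest(
        String pageUrl, String hashedUserId, String hashedPassword) {

        return HttpRequest.newBuilder(URI.create(pageUrl))
            .timeout(Duration.ofSeconds(10))
            .header("Cookie", cookieHeader(hashedUserId, hashedPassword))
            .GET()
            .build();
    }
}
```

A crawler built this way would send the request with java.net.http.HttpClient and hand the response body back to the indexing step. The hashed values would come from configuration rather than being hard-coded.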

Source Code

The source code contains three major files:

  • CrawlerConfiguration.java - Contains the hashed username / password of the dummy user that we will use to crawl the private pages.
  • LayoutCrawler.java - The custom layout crawler that crawls private pages.
  • LayoutIndexerPostProcessor.java - The post processor hook that calls our custom crawler.
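
The way these files fit together can be sketched in plain Java. This is only an illustration of the flow, not the module's code: the Page class, the LayoutPostProcessor class, and the "content" field name are hypothetical stand-ins for the Liferay layout, the IndexerPostProcessor implementation, and the search document field, and the crawler is reduced to a function from page URL to rendered text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical wiring sketch: a post processor that calls a crawler
// for private pages only and stores the result in the search document.
class PostProcessorWiringSketch {

    // Stand-in for a Liferay layout (page).
    static final class Page {
        final String url;
        final boolean privateLayout;

        Page(String url, boolean privateLayout) {
            this.url = url;
            this.privateLayout = privateLayout;
        }
    }

    // Stand-in for the custom IndexerPostProcessor implementation.
    static final class LayoutPostProcessor {
        private final Function<String, String> crawler;

        LayoutPostProcessor(Function<String, String> crawler) {
            this.crawler = crawler;
        }

        // Runs after the default indexing has filled the document.
        void postProcess(Map<String, String> document, Page page) {
            if (!page.privateLayout) {
                return; // public pages are left to the default crawler
            }

            String content = crawler.apply(page.url);

            // Only overwrite the field when the crawl returned something.
            if (content != null && !content.isEmpty()) {
                document.put("content", content);
            }
        }
    }
}
```

Keeping the crawler behind a simple function makes the post processor easy to exercise without a running portal, for example by passing a lambda that returns canned page text.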

You can find the source code here and the steps on how to set it up here.

Acknowledgements

I would like to thank Ha Tang for coming up with this brilliant idea and implementing it for our client. As always, constructive feedback is appreciated.
