Implement Predictive Search in Liferay

Part 1: Simple Autocomplete

Liferay doesn’t have predictive search, but that doesn’t mean it is difficult to implement. In this two-part blog series I’ll demonstrate two flavors: an autocomplete that suggests previous, successful keywords, and a search-as-you-type flavor that suggests matching content as keywords are entered.

The goal is to keep it as simple as possible and to lean on Liferay’s standard search functionality wherever I can. But one size does not fit all. Although a REST-based backend would probably be my first choice for every flavor of predictive search, there are scenarios, such as search as you type, where it currently fits poorly. I’ll come back to these challenges in the second part of this blog series.

This first example is simple and fast. I won’t go deep into the details, but I’ll outline how it can be done so that you can customize it to your needs. A link to the full source code is available at the end of this blog.

Now let’s start implementing autocomplete. The steps are as follows:

  1. Create a REST endpoint for predictive search
  2. Optimize index mappings
  3. Implement the predictive search REST resource
  4. Reindex
  5. Give Guest users access to the REST endpoint
  6. Enable keyword indexing
  7. Implement a widget template for search bar to enable autocomplete
  8. Feed with test data, test and have fun!

Tools used were:

  1. Liferay Developer Studio 3.9.5
  2. Liferay CE 7.4 GA10
  3. Liferay Workspace environment

1. Create a REST Endpoint for Predictive Search

I’ll start by creating a REST Builder project, autocomplete-rest, in my Liferay Workspace and defining the schema in rest-openapi.yaml. The key parts are the /suggestions resource:

paths:
    /suggestions:
        get:
            operationId: getSuggestions
            parameters:
                - in: query
                  name: keywords
                  required: true
                  schema:
                      type: string
                - in: query
                  name: languageId
                  required: true
                  schema:
                      type: string
                - in: query
                  name: size
                  required: true
                  schema:
                      format: int64
                      type: integer

And the suggestions entities:

components:
    schemas:
        Suggestion:
            properties:
                text:
                    type: string
            type: object
        Suggestions:
            properties:
                suggestions:
                    items:
                        $ref: "#/components/schemas/Suggestion"
                    type: array
            type: object
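
To make the contract above more concrete, here is a minimal plain-Java sketch of what a client request and a response body could look like. The base path /o/autocomplete-rest/v1.0 and the class and method names are assumptions made for illustration only; check the generated REST application for the real base URI. The JSON writer is deliberately minimal and only mirrors the Suggestions schema above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: not part of the generated REST Builder code.
final class SuggestionsEndpointSketch {

    // ASSUMPTION: hypothetical base path of the deployed REST application.
    static final String BASE_PATH = "/o/autocomplete-rest/v1.0";

    // Builds the GET URL with the three required query parameters
    // (keywords, languageId, size) from the OpenAPI definition.
    static String buildUrl(String keywords, String languageId, long size) {
        return BASE_PATH + "/suggestions"
            + "?keywords=" + URLEncoder.encode(keywords, StandardCharsets.UTF_8)
            + "&languageId=" + URLEncoder.encode(languageId, StandardCharsets.UTF_8)
            + "&size=" + size;
    }

    // Minimal escaping: only backslashes and double quotes are handled.
    static String escape(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Serializes suggestion texts into the shape implied by the schema:
    // {"suggestions":[{"text":"..."}]}
    static String toJson(List<String> texts) {
        return texts.stream()
            .map(text -> "{\"text\":\"" + escape(text) + "\"}")
            .collect(Collectors.joining(",", "{\"suggestions\":[", "]}"));
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("liferay dx", "en_US", 10));
        System.out.println(toJson(List.of("liferay dxp", "liferay portal")));
    }
}
```

The response actually produced by the generated resource may carry additional properties; the sketch only shows the fields defined in the schema.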
    

2. Optimize Index Mappings

To create keyword suggestions I’ll use Liferay’s out-of-the-box keyword (query) indexing feature. This feature is a bit outdated and has no data management user interface, but it will do as the data backbone for this demonstration. Two important caveats, though: Liferay query indexing scopes keywords only by virtual instance, and it does no permission checking. This means that keywords stored on one site will be served on every site in the same virtual instance.

As the next step, after running REST Builder, I'll slightly tune Liferay's keyword indexing settings to better support autocompletion. By default the data is stored in the index in a localized keywordSearch_xx_YY field of type text. To rank relevant suggestions higher in my autocompletion box, I'll change the field mapping to Elasticsearch's search_as_you_type type, which adds 2- and 3-shingle subfields. Apart from that, the original field mapping remains untouched. More information about this field type can be found in the Elasticsearch documentation.

This customization is done with the help of Liferay's IndexSettingsContributor API. The mapping change itself is very small, and I'm doing it only for English here. Add an entry for each language you need to support:


{
	"properties": {
		"keywordSearch_en_US": {
			"analyzer": "english",
			"store": true,
			"type": "search_as_you_type"
		}
	}
}
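
To make the 2- and 3-shingle subfields more tangible, here is a small plain-Java sketch that prints the word shingles of a phrase. It is a simplification under stated assumptions: it only lowercases and splits on whitespace, whereas the real english analyzer also stems words and removes stop words, so the terms actually stored in the index will differ. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustration only: a rough approximation of word shingles,
// not the Elasticsearch analysis chain.
final class ShingleSketch {

    // Returns every run of `size` consecutive whitespace-separated tokens.
    static List<String> shingles(String text, int size) {
        String[] tokens = text.trim().toLowerCase(Locale.ROOT).split("\\s+");
        List<String> result = new ArrayList<>();

        for (int i = 0; i + size <= tokens.length; i++) {
            result.add(
                String.join(" ", Arrays.copyOfRange(tokens, i, i + size)));
        }

        return result;
    }

    public static void main(String[] args) {
        String phrase = "Liferay digital experience platform";

        // [liferay digital, digital experience, experience platform]
        System.out.println(shingles(phrase, 2));

        // [liferay digital experience, digital experience platform]
        System.out.println(shingles(phrase, 3));
    }
}
```

Because multi-word runs are indexed as single terms, a stored keyword that contains the typed words in the same order can match on the shingle subfields as well as on the root field, which is what pushes it higher in the suggestion list.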

3. Implement the Predictive Search REST Resource

Next I'll implement the REST resource class that serves the suggestions. I'll rely on Liferay's low-level search API to find matches, both because it performs better and because permission checks on the data I'm going to use aren't even possible. The key part of the implementation class is how the actual search query is built. I'll use the approach suggested in the Elasticsearch documentation as is:


private Query _getSearchQuery(String keywords, String languageId) {
	BooleanQuery booleanQuery = _queries.booleanQuery();

	// Only match stored query suggestions in the requested language
	booleanQuery.addFilterQueryClauses(
		_queries.term(Field.LANGUAGE_ID, languageId),
		_queries.term(Field.TYPE, "querySuggestion"));

	// Localized field name, e.g. keywordSearch_en_US
	String keywordSearchField = LocalizationUtil.getLocalizedName(
		Field.KEYWORD_SEARCH, languageId);

	// Search the root field and its 2- and 3-shingle subfields
	MultiMatchQuery multiMatchQuery = _queries.multiMatch(
		keywords,
		keywordSearchField,
		StringBundler.concat(keywordSearchField, "._2gram"),
		StringBundler.concat(keywordSearchField, "._3gram"));

	// bool_prefix treats the last term of the input as a prefix
	multiMatchQuery.setType(MultiMatchQuery.Type.BOOL_PREFIX);

	booleanQuery.addMustQueryClauses(multiMatchQuery);

	return booleanQuery;
}
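
For reference, the multi-match part of this query follows the pattern shown in the Elasticsearch documentation for search_as_you_type fields. The plain-Java sketch below only prints that JSON pattern for a given field name. It is an illustration, not necessarily the exact request Liferay sends, and it leaves out the filter clauses on language and type. The class and method names are hypothetical.

```java
// Illustration only: prints the Elasticsearch query pattern for
// a bool_prefix multi_match over a search_as_you_type field.
final class BoolPrefixQuerySketch {

    // Minimal escaping: only backslashes and double quotes are handled.
    static String escape(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the multi_match body for the root field and its
    // 2- and 3-shingle subfields.
    static String multiMatchJson(String keywords, String field) {
        return "{\"query\":{\"multi_match\":{"
            + "\"query\":\"" + escape(keywords) + "\","
            + "\"type\":\"bool_prefix\","
            + "\"fields\":[\"" + field + "\",\""
            + field + "._2gram\",\""
            + field + "._3gram\"]"
            + "}}}";
    }

    public static void main(String[] args) {
        System.out.println(multiMatchJson("liferay d", "keywordSearch_en_US"));
    }
}
```

With the bool_prefix type, every term of the input except the last is matched as a whole term and the last one is matched as a prefix, which suits input that is still being typed.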

4. Reindex

After the endpoint resource is implemented, it's time to deploy the modules and reindex to apply the new index mappings. Reindexing can be done in Control Panel -> Search -> Index Actions. When the reindex has finished, you can check on the Field Mappings tab that the new settings for keywordSearch_en_US were applied.

5. Give Guest Users Access to the REST Endpoint

By default, Liferay’s service access policies require an authenticated user to access any of the REST endpoints. I want Guest users to be able to use autocomplete, so I have to define a policy for that. Go to Control Panel -> Service Access Policies and add an entry for the service class fi.soveltia.liferay.autocomplete.rest.internal.resource.v1_0.SuggestionResourceImpl as follows:


 

6. Enable Keyword Indexing

Next, keyword indexing has to be enabled so that successful previous queries are stored in the index and can be served as suggestions to users. On a search page, add the Suggestions widget, open its configuration and configure it as follows:


This effectively stores keywords when there are at least 5 results for them.

7. Implement a Widget Template for Search Bar to Enable Autocomplete

The backend is done, but I need a frontend to present the suggestions. One approach would be to implement a custom search bar widget, which would certainly be a good option and by far the most flexible one, but we can do it more simply with a search bar widget template.

For the autocomplete user interface component there are several options available, but I’ll be using my old friend DevBridge Autocomplete, which is easy to use and has a good number of out-of-the-box customization options that make it flexible for various kinds of scenarios. See its documentation to check whether it fits your purposes.

To create a widget template, go to Portal Global site -> Design -> Templates -> Widget Templates and click the plus icon to add a new search bar template. The template can be found here.

8. Feed With Test Data, Test and Have Fun!

Almost there, but we still need test content and successful keywords stored in the index.

Add content to your portal so that your test keywords return more than 5 results (the Query Indexing Threshold). I’ll simply add some Liferay whitepaper PDFs to the document library and run test searches that give me 8-10 results each. After I'm done and start typing in the search bar, I have a working autocomplete.

Closing Notes

This blog and the attached source code hopefully demonstrate how a simple autocomplete on Liferay can be implemented with very few lines of code.

You might ask why I used regular queries rather than Elasticsearch suggesters, which are supposed to be optimized for fast suggestion retrieval. That's a good question, and there are a few answers to it.

First, to use the Completion suggester on an index field, the field has to be of the completion type. That's of course no problem if index mappings are customized, but out of the box there are no fields of that type in the Liferay index.

Secondly, as mentioned earlier, suggesters do not support any kind of scoping. It's not possible to scope keywords by groupId even if the information were available in a stored keyword entry. In this example I didn't scope by anything other than languageId, but in a production environment I'd want to scope by groupId, both to keep potentially sensitive keywords from spreading and because keyword suggestions from other sites don't necessarily make sense on a particular site when the search is scoped to that group. With a Context suggester it would theoretically be possible to scope suggestions, but that's not even supported in Liferay portal.

An additional benefit of regular queries over suggesters is that they offer many more easy-to-use tools for relevance tuning.

Thanks for reading! See you in part two, where I'll implement a search-as-you-type flavor of predictive search.

Complete source code can be found here.
