RE: Liferay 7.0 - Spellchecker suggesting more words than expected

Jordi Martinez, modified 5 Years ago. New Member Posts: 5 Join Date: 1/24/20 Recent Posts
Hello everybody, first message here.

I tried to add a spellchecker to my search, using a customized dictionary that contains the words that fit my content (the default dictionary plus some bank, city, and people names).
Everything went fine and I built the whole frontend part to fetch the suggestions and show them, but the problem is that some suggestions are words that don't appear in my dictionary, or don't really fit what I want. Here are some examples:
  • "infromation" -> "inform" (instead of "information")
  • "informapion" -> "information"
  • "conutry" -> "countri" (instead of "country" or "countries")
If I look for "Agrentina", it returns "argentina", and "bink" returns "bank", so most words work. But some words (so far I have found only two cases, "infromation" and "conutry", which return "inform" and "countri" respectively) don't produce the suggestion I want. Does anyone know why this is happening?

Hope that everything is clear, but if not, I'll be glad to answer your questions.

P.S.: I'm calling this method to get the data:
String collatedKeywords = IndexSearcherHelperUtil.spellCheckKeywords(searchContext);
where the searchContext carries the term I'm looking for in its keywords attribute.

Thanks in advance for your help!


 
Jordi Martinez, modified 5 Years ago. New Member Posts: 5 Join Date: 1/24/20 Recent Posts
Any help would be appreciated
Olaf Kock, modified 5 Years ago. Liferay Legend Posts: 6441 Join Date: 9/23/08 Recent Posts
Jordi Martinez:

I tried to add a spellchecker to my search, using a customized dictionary that contains the words that fit my content (the default dictionary plus some bank, city, and people names).
Everything went fine and I built the whole frontend part to fetch the suggestions and show them, but the problem is that some suggestions are words that don't appear in my dictionary, or don't really fit what I want.
Sounds like your custom dictionary isn't large enough, and you're relying on stuff that a search index happens to have indexed... If you're spellchecking with the help of a search index, you're bound to whatever you have indexed. If your content happens to contain phrases that you need: Perfect. If it doesn't, you might just need to put it in.
Google knows all of the 100 different spellings for Britney Spears because people have searched for it. Without that (or without finding all of the wrong spellings somewhere on the internet) they wouldn't know. Even if nobody ever used the right spelling, they'd still have it in their index, as it will have been used somewhere. But with a small enough (indexed) sample of the web, it might not be.
I never expected to be able to link to this article in a serious business article.
Jordi Martinez, modified 5 Years ago. New Member Posts: 5 Join Date: 1/24/20 Recent Posts
Hi Olaf.

First of all, thanks for the response.
Second, I'm happy to have made you do something so impressive for the first time; Britney would be happy.
... and third: I'm using the custom dictionary, but also a "handmade" search engine and portlet (the customer asked for some specific search results and methods). So maybe the spellchecker is reading from indexes that don't correspond to the results of that engine.
So maybe the question should be: "Can I use the spellchecker with a defined dictionary to autocorrect misspelled words?" Any suggestion on how to do it would be appreciated.
In my example:
  • "infromation" -> "inform" --> Should return "information", as all the letters match except 2 that are swapped
  • "informapion" -> "information" --> OK
  • "conutry" -> "countri" --> Should return "country", as all the letters match except 2 that are swapped
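
For illustration only, the matching described in the list above (two swapped neighbouring letters counted as a single mistake) can be sketched in plain Java, independent of Liferay. This is a minimal sketch and not the Liferay spellchecker: the class name DictionarySpellChecker, the methods distance and suggest, and the maxDistance parameter are all hypothetical names, and the technique used is the optimal string alignment variant of Damerau-Levenshtein distance, checked against a fixed word list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of Liferay: suggests the closest word from a
// fixed dictionary, counting a swap of two adjacent letters as one edit.
final class DictionarySpellChecker {

    private DictionarySpellChecker() {
    }

    // Optimal string alignment distance: insertions, deletions, substitutions
    // and transpositions of adjacent characters each cost 1.
    public static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i; // delete all i characters of a
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j; // insert all j characters of b
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                    Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                    d[i - 1][j - 1] + cost);
                // Adjacent transposition, e.g. "ro" <-> "or".
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    // Returns the dictionary word closest to the input, or empty if no word
    // is within maxDistance edits. On a tie, the earlier dictionary entry wins.
    public static Optional<String> suggest(
            String word, List<String> dictionary, int maxDistance) {
        String needle = word.toLowerCase(Locale.ROOT);
        String bestWord = null;
        int bestDistance = maxDistance + 1;
        for (String candidate : dictionary) {
            int dist = distance(needle, candidate.toLowerCase(Locale.ROOT));
            if (dist < bestDistance) {
                bestDistance = dist;
                bestWord = candidate;
            }
        }
        return Optional.ofNullable(bestWord);
    }

    public static void main(String[] args) {
        // Sample dictionary made up for this sketch.
        List<String> dictionary = Arrays.asList(
            "information", "inform", "country", "countries", "argentina", "bank");
        for (String typo : Arrays.asList(
                "infromation", "informapion", "conutry", "Agrentina", "bink")) {
            System.out.println(typo + " -> " + suggest(typo, dictionary, 2));
        }
    }
}
```

With this sample dictionary, "infromation" and "conutry" are each one edit away from "information" and "country", because the swapped pair counts once; plain Levenshtein distance would count each swap as two edits. The suggestions can only ever be words from the list passed in, so the result depends entirely on what the dictionary contains.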

Thanks in advance for your reply! 
Olaf Kock, modified 5 Years ago. Liferay Legend Posts: 6441 Join Date: 9/23/08 Recent Posts
Jordi Martinez:

Second, I'm happy to have made you do something so impressive for the first time; Britney would be happy.

She might not be that happy, as I linked the German version. Here's the English article.
Oops ... I did it again.
Jordi Martinez:
... and third: I'm using the custom dictionary, but also a "handmade" search engine and portlet (the customer asked for some specific search results and methods). So maybe the spellchecker is reading from indexes that don't correspond to the results of that engine.
So maybe the question should be: "Can I use the spellchecker with a defined dictionary to autocorrect misspelled words?" Any suggestion on how to do it would be appreciated.

Sorry, this goes beyond what I've tried in the past. I'm relying on someone else's memory here as I'd need to dive deep and implement it for my first time. Please don't hold it against me. ;)