We have an index with a keyword field called keywords.
A document may have keywords such as: [‘cheesecake’, ‘cinnamon roll’].
If the input text contains the word ‘cheesecake’, there is no problem. But if the input text is something like ‘Today I have eaten a cinnamon roll’, there is no match. We think the problem is that the input text is tokenized into single words, so neither ‘cinnamon’ nor ‘roll’ matches our keyword ‘cinnamon roll’ (and we don’t want them to: only the full phrase ‘cinnamon roll’ should match the keyword ‘cinnamon roll’).
How can we solve this? We thought of using shingles, but we could not find the right way to set them up. Only the input search text needs to be tokenized; the keywords field itself is a keyword field and is not analyzed.
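To illustrate what we mean by shingles, here is a minimal client-side sketch in Python. It is not something we have working against the index: the helper names `shingles` and `build_query` are hypothetical, the shingle size of 2 is an assumption, and lowercasing the input assumes the stored keywords are all lowercase, as in our sample document. The idea is to turn the input text into every run of one or two consecutive words and pass those as exact values in a `terms` query, so that ‘cinnamon roll’ is sent as one value.

```python
import json
import re


def shingles(text, max_size=2):
    """Return every run of 1..max_size consecutive words, lowercased.

    Hypothetical helper: the word splitting here is a plain regex,
    not the Elasticsearch standard tokenizer.
    """
    words = re.findall(r"\w+", text.lower())
    result = []
    for size in range(1, max_size + 1):
        for start in range(len(words) - size + 1):
            result.append(" ".join(words[start:start + size]))
    return result


def build_query(text, language_id, web_id, max_size=2):
    """Build a search body whose terms query only matches whole keywords."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Each shingle is compared exactly against the keyword field.
                    {"terms": {"keywords": shingles(text, max_size)}}
                ],
                "filter": [
                    {"term": {"languageId": language_id}},
                    {"term": {"webId": web_id}},
                ],
            }
        }
    }


if __name__ == "__main__":
    body = build_query("Today I have eaten a cinnamon roll", 1, 2)
    print(json.dumps(body, indent=2))
```

With this approach the single word ‘cinnamon’ is also sent, but it can only match a keyword that is exactly ‘cinnamon’, never ‘cinnamon roll’. What we would prefer, if possible, is to have Elasticsearch produce these shingles itself at search time.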
This is our current query:
GET /food-suggestion/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "keywords": {
              "query": "cinnamon roll",
              "analyzer": "standard",
              "operator": "or"
            }
          }
        }
      ],
      "filter": [
        {
          "term": {
            "languageId": 1
          }
        },
        {
          "term": {
            "webId": 2
          }
        }
      ]
    }
  }
}
Index mapping:
- description: Text
- id: Integer
- keywords: Keyword
- languageId: Integer
- foodId: Long
- title: Text
- webId: Integer
This is a document of the index:
{
  "description": "Bla bla bla",
  "keywords": [
    "cinnamon roll",
    "crema catalana",
    "cheesecake"
  ],
  "languageId": 1,
  "foodId": 13,
  "title": "Sample title",
  "webId": 2
}