I am working on a Laravel project that uses MongoDB via MongoDB Atlas. I have a collection with millions of documents, and I am running a complex search query that includes keyword search, sorting, and grouping. However, the query is taking too long to execute and is not performing well. Here is the query I am using:
[
{
"$search": {
"index": "keyword_search_index",
"queryString": {
"defaultPath": "name",
"query": "((title: ("product") OR description: ("product") OR ai_tech_types.name: ("product") OR company.name: ("product") OR industry_focus.name: ("product") OR skills.name: ("product") OR title: ("products") OR description: ("products") OR ai_tech_types.name: ("products") OR company.name: ("products") OR industry_focus.name: ("products") OR skills.name: ("products")) OR (title: ("data") OR description: ("data") OR ai_tech_types.name: ("data") OR company.name: ("data") OR industry_focus.name: ("data") OR skills.name: ("data")) OR (title: ("artificial intelligence") OR description: ("artificial intelligence") OR ai_tech_types.name: ("artificial intelligence") OR company.name: ("artificial intelligence") OR industry_focus.name: ("artificial intelligence") OR skills.name: ("artificial intelligence") OR title: ("artificial intelligences") OR description: ("artificial intelligences") OR ai_tech_types.name: ("artificial intelligences") OR company.name: ("artificial intelligences") OR industry_focus.name: ("artificial intelligences") OR skills.name: ("artificial intelligences")) OR (title: ("learning") OR description: ("learning") OR ai_tech_types.name: ("learning") OR company.name: ("learning") OR industry_focus.name: ("learning") OR skills.name: ("learning") OR title: ("learnings") OR description: ("learnings") OR ai_tech_types.name: ("learnings") OR company.name: ("learnings") OR industry_focus.name: ("learnings") OR skills.name: ("learnings")))"
}
}
},
{
"$match": {
"post_date": {
"$gte": {
"$date": "2024-06-18T18:30:00Z"
},
"$lte": {
"$date": "2024-07-18T18:29:59Z"
}
}
}
},
{
"$unwind": {
"path": "$ai_tech_types"
}
},
{
"$group": {
"_id": {
"$toLower": "$ai_tech_types.code"
},
"name": {
"$first": "$ai_tech_types.name"
},
"code": {
"$first": "$ai_tech_types.code"
},
"count": {
"$sum": 1
}
}
},
{
"$project": {
"name": 1,
"code": 1,
"count": 1
}
},
{
"$sort": {
"count": -1
}
}
]
Requirements:
Perform a text search across multiple fields.
Filter documents based on a date range.
Unwind an array field.
Group by a specific field, count occurrences, and then sort by the count.
Issues:
The query is taking too long to execute and often times out.
The dataset is quite large, with millions of documents.
What I’ve Tried:
Ensuring appropriate indexes are in place.
Optimising the $search query.
Questions:
How can I optimise this query for better performance given the large dataset?
Are there any MongoDB Atlas-specific optimisation or configurations that I can leverage?
Is there a more efficient way to structure the query or the data to achieve the same results?