Quantcast
Viewing all articles
Browse latest Browse all 271

How to Optimize MongoDB Aggregation Pipeline with Keyword Search and Sorting for Large Dataset?

I am working on a Laravel project that uses MongoDB via MongoDB Atlas. I have a collection with millions of documents, and I am running a complex search query that includes keyword search, sorting, and grouping. However, the query is taking too long to execute and is not performing well. Here is the query I am using:

[    {"$search": {"index": "keyword_search_index","queryString": {"defaultPath": "name","query": "((title: (\"product\") OR description: (\"product\") OR ai_tech_types.name: (\"product\") OR company.name: (\"product\") OR industry_focus.name: (\"product\") OR skills.name: (\"product\") OR title: (\"products\") OR description: (\"products\") OR ai_tech_types.name: (\"products\") OR company.name: (\"products\") OR industry_focus.name: (\"products\") OR skills.name: (\"products\")) OR (title: (\"data\") OR description: (\"data\") OR ai_tech_types.name: (\"data\") OR company.name: (\"data\") OR industry_focus.name: (\"data\") OR skills.name: (\"data\")) OR (title: (\"artificial intelligence\") OR description: (\"artificial intelligence\") OR ai_tech_types.name: (\"artificial intelligence\") OR company.name: (\"artificial intelligence\") OR industry_focus.name: (\"artificial intelligence\") OR skills.name: (\"artificial intelligence\") OR title: (\"artificial intelligences\") OR description: (\"artificial intelligences\") OR ai_tech_types.name: (\"artificial intelligences\") OR company.name: (\"artificial intelligences\") OR industry_focus.name: (\"artificial intelligences\") OR skills.name: (\"artificial intelligences\")) OR (title: (\"learning\") OR description: (\"learning\") OR ai_tech_types.name: (\"learning\") OR company.name: (\"learning\") OR industry_focus.name: (\"learning\") OR skills.name: (\"learning\") OR title: (\"learnings\") OR description: (\"learnings\") OR ai_tech_types.name: (\"learnings\") OR company.name: (\"learnings\") OR industry_focus.name: (\"learnings\") OR skills.name: (\"learnings\")))"            }        }    },    {"$match": {"post_date": {"$gte": {"$date": "2024-06-18T18:30:00Z"                },"$lte": {"$date": "2024-07-18T18:29:59Z"                }            }        }    },    {"$unwind": {"path": "$ai_tech_types"        }    },    {"$group": {"_id": {"$toLower": "$ai_tech_types.code"            },"name": {"$first": "$ai_tech_types.name"            },"code": {"$first": "$ai_tech_types.code"            },"count": {"$sum": 1            }        }    },    {"$project": {"name": 1,"code": 1,"count": 1        }    },    {"$sort": {"count": -1        }    }]

Requirements:Perform a text search across multiple fields.Filter documents based on a date range.Unwind an array field.Group by a specific field, count occurrences, and then sort by the count.

Issues:The query is taking too long to execute and often times out.The dataset is quite large, with millions of documents.

What I've Tried:Ensuring appropriate indexes are in place.Optimising the $search query.

Questions:How can I optimise this query for better performance given the large dataset?Are there any MongoDB Atlas-specific optimisation or configurations that I can leverage?Is there a more efficient way to structure the query or the data to achieve the same results?


Viewing all articles
Browse latest Browse all 271

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>