I was working on an aggregation pipeline to clean up duplicate data in a collection, but it was throwing the following error even after setting { allowDiskUse: true }:
Err: "MongoServerError: PlanExecutor error during aggregation :: caused by :: Exceeded memory limit for $group, but didn't allow external spilling; pass allowDiskUse:true to opt in"
The objective was to delete all duplicates (documents with the same phoneNumber and the same startTime), retaining only the first entry in the database.
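To make the objective concrete, here is a minimal in-memory sketch of the result I'm after. It is not the aggregation itself: the helper name findDuplicatesToDelete and the sample documents are made up for illustration, and "first entry" is assumed to mean the earliest createdAt within each (phoneNumber, startTime) group.

```javascript
// Hypothetical helper (illustration only): returns the documents that should
// be deleted, i.e. every document except the earliest one (by createdAt)
// in each (phoneNumber, startTime) group.
function findDuplicatesToDelete(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // Serialising the pair as a JSON array keeps the two key parts unambiguous.
    const key = JSON.stringify([doc.details.phoneNumber, doc.details.startTime]);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc);
  }

  const toDelete = [];
  for (const group of groups.values()) {
    if (group.length < 2) continue; // no duplicates in this group
    // Sort a copy so the input order is left untouched.
    const sorted = [...group].sort((a, b) => a.createdAt - b.createdAt);
    toDelete.push(...sorted.slice(1)); // keep the first, delete the rest
  }
  return toDelete;
}

// Made-up sample data; createdAt is a plain number here for brevity.
const sample = [
  { _id: 1, createdAt: 100, details: { phoneNumber: "111", startTime: "09:00" } },
  { _id: 2, createdAt: 200, details: { phoneNumber: "111", startTime: "09:00" } },
  { _id: 3, createdAt: 150, details: { phoneNumber: "111", startTime: "10:00" } },
  { _id: 4, createdAt: 50, details: { phoneNumber: "111", startTime: "09:00" } },
];

const idsToDelete = findDuplicatesToDelete(sample).map((d) => d._id);
console.log(idsToDelete); // [ 1, 2 ]
```

In this sample, documents 1, 2 and 4 share the same phoneNumber and startTime; document 4 is the earliest, so only 1 and 2 should be removed, and document 3 is left alone because it has no duplicate.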
Below is my pipeline:
let pipeline = [
  {
    $group: {
      _id: {
        phoneNumber: "$details.phoneNumber",
        startTime: "$details.startTime",
      },
      docs: { $push: "$$ROOT" },
      count: { $sum: 1 },
    },
  },
  {
    $match: {
      count: { $gt: 1 },
    },
  },
  {
    $unwind: "$docs",
  },
  {
    $sort: { "docs.createdAt": 1 },
  },
  {
    $skip: 1,
  },
  {
    $replaceRoot: { newRoot: "$docs" },
  },
]
When calling aggregate, I passed in allowDiskUse as described in the driver's types, but it still throws the same error.
This is how I set the option:
const options = {
  allowDiskUse: true,
  maxTimeMS: 10000,
}
Then, using the MongoDB Node.js driver v5.8, I ran the aggregation like this:
db.collection(collectionName).aggregate(pipeline, options)
I added allowDiskUse as the error message suggests, but it still doesn't work. Even MongoDB Compass throws the same error.
So, what's the issue here?
And if there is another way to achieve my objective, please comment.
Thanks