Channel: Active questions tagged mongodb-atlas - Stack Overflow

Atlas Search Index Build Fail


I am working with a large dataset (several thousand documents) and I am trying to construct an Atlas Lucene search index for a particular field in these documents. To give an idea of my data, here's a simplified version of my documents:

{
    name: 'XYZ',
    lastUpdated: date 1,
    fundamentalData: {
        description: stuff,
        lastUpdatedFA: date 2,
        ...a lot more data
    },
    performanceData: [a lot of nested objects],
    otherPerformanceData: [more nested objects],
    ... more descriptive data
}

The issue arises when I attempt to create a straightforward search index on the fundamentalData.description field. The build consistently fails with this message:

'Your index could not be built: Unexpected error: DocValuesField "$type:date/lastUpdated" appears more than once in this document (only one value is allowed per field)'

This error suggests that the lastUpdated field appears more than once in a document. However, I've verified with Python that this isn't the case (see the code at the end).

As a side note, I have a field fundamentalData.lastUpdatedFA which is structurally similar to lastUpdated, but as far as I can tell this should not be an issue as long as the names are not identical. I even ran an updateMany to rename that field to something completely different. No luck.

Interestingly, when I build a conventional text index with db.collection.createIndex( { "fundamentalData.description": "text" } ), everything works as expected. I'm aware that Atlas Search differs significantly from the legacy createIndex text index, but I'm not sure how that affects my case here.
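To make the target concrete, here is a minimal sketch of an Atlas Search index definition that covers only fundamentalData.description. It assumes static mappings (dynamic set to false), so lastUpdated is not part of the mapping at all. The index name description_search and the helper function are hypothetical and used only for illustration; this is not necessarily the definition that produced the error above.

```python
import json

# Hypothetical index name, chosen only for this example.
INDEX_NAME = "description_search"


def build_description_index_definition():
    """Return an index definition that maps only fundamentalData.description."""
    return {
        "mappings": {
            # Static mapping: fields not listed below are not indexed.
            "dynamic": False,
            "fields": {
                "fundamentalData": {
                    "type": "document",
                    "dynamic": False,
                    "fields": {
                        "description": {"type": "string"},
                    },
                },
            },
        },
    }


definition = build_description_index_definition()
# Print the definition as JSON, e.g. for pasting into the Atlas JSON editor.
print(json.dumps(definition, indent=2))
```

The printed JSON can be compared against the definition currently configured in Atlas; if that one uses dynamic mappings, every field in the document, including lastUpdated, is a candidate for indexing.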

I would appreciate any insights or suggestions. Thanks!

Logan

import time

# Assumes `db` is an existing pymongo Database handle.

def find_duplicate_lastUpdated(collection):
    # Collect the names of documents whose string form contains
    # the key 'lastUpdated' more than once (at any nesting level).
    duplicate_lastUpdated_docs = []
    for doc in collection.find():
        lastUpdated_count = str(doc).count("'lastUpdated'")
        if lastUpdated_count > 1:
            duplicate_lastUpdated_docs.append(doc['name'])
        time.sleep(0.01)  # sleep for 10 milliseconds
    return duplicate_lastUpdated_docs

collection = db["assetdatas"]
duplicates = find_duplicate_lastUpdated(collection)
len(duplicates)
# for name in duplicates:
#     print(name)
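The substring count above treats every nesting level the same and cannot tell whether lastUpdated itself holds an array of values. A stricter variant, sketched below, reports how often the key occurs at each dotted path and whether the top-level value is a list. The helper names and the sample document are made up for illustration, and whether either condition is what actually triggers the index error is an open assumption.

```python
from collections import Counter


def find_key_paths(value, key, prefix=""):
    """Yield the dotted path of every occurrence of `key`; array indices are dropped."""
    if isinstance(value, dict):
        for k, v in value.items():
            path = f"{prefix}.{k}" if prefix else k
            if k == key:
                yield path
            yield from find_key_paths(v, key, path)
    elif isinstance(value, list):
        for item in value:
            yield from find_key_paths(item, key, prefix)


def describe_key(doc, key="lastUpdated"):
    """Return (occurrences per path, whether the top-level value is a list)."""
    counts = Counter(find_key_paths(doc, key))
    top_level_is_array = isinstance(doc.get(key), list)
    return counts, top_level_is_array


# Made-up sample document shaped like the simplified example in the question.
sample = {
    "name": "XYZ",
    "lastUpdated": "2023-01-01",
    "performanceData": [
        {"lastUpdated": "2023-01-02"},
        {"lastUpdated": "2023-01-03"},
    ],
}

counts, is_array = describe_key(sample)
print(dict(counts), is_array)
```

Run against real documents (for example inside the collection.find() loop), this would show whether lastUpdated also lives inside the nested performance arrays or is stored as a list in some documents.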
