Recently I was looking at a MongoDB service and wondered if doing ReplaceOne was really the right call.
My document was pretty simple: some identifying information that doesn’t change, and an array of informational state that grows (existing entries are never updated) as my program runs. The documents are written to the database every 30 seconds, so without some additional bookkeeping I don’t know exactly which entries are new to the array until it is time to update.
So I wrote this simple benchmarking program to try and see what the difference is (if any).
AddToSetEach
Adds items to the given array field if they do not already exist in it. I picked this one so I could just hand MongoDB the entire array and let it figure out which items need to be added.
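To make the behavior concrete, here is a minimal Python sketch of the update document that corresponds to AddToSetEach (the $addToSet operator with the $each modifier), plus an in-memory stand-in for what the server does with it. The field name "states" and both function names are made up for illustration. With pymongo, the update document would be passed to collection.update_one(filter, update).

```python
from typing import Any


def add_to_set_each_update(field: str, items: list[Any]) -> dict:
    """Build the update document for $addToSet with the $each modifier."""
    return {"$addToSet": {field: {"$each": list(items)}}}


def apply_add_to_set_each(document: dict, field: str, items: list[Any]) -> dict:
    """In-memory stand-in for the server side: append each item only if it
    is not already present in the array. Returns a new document and leaves
    the input untouched."""
    updated = dict(document)
    existing = list(updated.get(field, []))
    for item in items:
        if item not in existing:
            existing.append(item)
    updated[field] = existing
    return updated


# "states" is a hypothetical field name.
doc = {"_id": 1, "states": ["created"]}
update = add_to_set_each_update("states", ["created", "running"])
result = apply_add_to_set_each(doc, "states", ["created", "running"])
```

Sending the whole array every time is convenient, but it also means the server compares every item against the existing array, which is worth keeping in mind as the arrays grow.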
ReplaceOne
Replaces the content of the document, but keeps the database key intact. This means that the index on that key won’t need to be updated every time an item is replaced, which is probably a good thing.
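As a rough illustration, the sketch below mimics replace semantics in plain Python, assuming the database key is the document's _id field. Every field not present in the replacement is dropped, while the key is carried over. The function name and sample fields are hypothetical. With pymongo, the real call would be collection.replace_one(filter, replacement).

```python
def apply_replace_one(stored: dict, replacement: dict) -> dict:
    """In-memory stand-in for a replace: keep the stored _id, take every
    other field from the replacement document."""
    if "_id" in replacement and replacement["_id"] != stored["_id"]:
        # A replacement is not allowed to change the key.
        raise ValueError("a replacement cannot change _id")
    new_doc = {"_id": stored["_id"]}
    new_doc.update({k: v for k, v in replacement.items() if k != "_id"})
    return new_doc


# Hypothetical sample data: "legacy" is absent from the replacement.
stored = {"_id": 7, "name": "sensor-a", "states": ["created"], "legacy": True}
replacement = {"name": "sensor-a", "states": ["created", "running"]}
replaced = apply_replace_one(stored, replacement)
```

Note that the "legacy" field disappears after the replace, which is the main behavioral difference from a targeted array update.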
Benchmark Code
I wanted to be as fair as possible, so the program takes two parameters: the number of items to use and which update method to use. It then executes the following steps:
1. Drop database (start with a clean slate)
2. Add items
3. Update items in code
4. Update items in database
5. Record results to file
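The five steps above could be sketched in Python roughly as follows. This is not the original benchmark: InMemoryStore is a hypothetical stand-in for a real collection so the harness runs without a server, each "item" is assumed to be one document, and the field names, method names, and CSV layout are all illustrative. Only step 4 is timed here, which is also an assumption.

```python
import copy
import csv
import time


class InMemoryStore:
    """Hypothetical stand-in for a MongoDB collection."""

    def __init__(self):
        self.docs = {}

    def drop(self):
        self.docs.clear()

    def insert_many(self, docs):
        for d in docs:
            self.docs[d["_id"]] = copy.deepcopy(d)

    def replace_one(self, doc):
        self.docs[doc["_id"]] = copy.deepcopy(doc)

    def add_to_set_each(self, doc_id, field, items):
        arr = self.docs[doc_id].setdefault(field, [])
        for item in items:
            if item not in arr:
                arr.append(item)


def run_benchmark(store, item_count, method):
    """Run one pass and return the seconds spent on the database update."""
    if method not in ("replace", "addtoset"):
        raise ValueError(f"unknown method: {method}")
    store.drop()                                   # 1. clean slate
    docs = [{"_id": i, "name": f"item-{i}", "states": ["created"]}
            for i in range(item_count)]
    store.insert_many(docs)                        # 2. add items
    for d in docs:                                 # 3. update items in code
        d["states"].append("updated")
    start = time.perf_counter()
    for d in docs:                                 # 4. update items in database
        if method == "replace":
            store.replace_one(d)
        else:
            store.add_to_set_each(d["_id"], "states", d["states"])
    return time.perf_counter() - start


def record_result(path, item_count, method, seconds):
    """5. Append one CSV row per run."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([item_count, method, f"{seconds:.6f}"])
```

Swapping InMemoryStore for a thin wrapper around a real collection would keep the same flow while measuring actual database round trips.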
I let the program run overnight, increasing the number of items from 10,000 to 52,000.
Results and Takeaways
The results were pretty surprising – I was expecting the AddToSetEach to be a little bit faster since it wasn’t whole-hog replacing everything, but they were essentially neck-and-neck until we got to 40,000+ items.
Another thing to note is that there are other ways of doing this. If you know which items are going to be added to the array, you could use a Push, which I would assume is a little faster since MongoDB doesn’t check for duplicates. I also didn’t run it on a really large dataset (1,000,000 items, for example), but with this code and enough patience you definitely could.
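For comparison, here is a minimal sketch of the Push alternative using the $push operator with $each, again with an in-memory stand-in for the server. The function names and the "states" field are hypothetical. The point it shows is that nothing is deduplicated, so this only works if you send just the new items.

```python
from typing import Any


def push_each_update(field: str, new_items: list[Any]) -> dict:
    """Build the update document for $push with the $each modifier."""
    return {"$push": {field: {"$each": list(new_items)}}}


def apply_push_each(document: dict, field: str, new_items: list[Any]) -> dict:
    """In-memory stand-in: append every item, duplicates included."""
    updated = dict(document)
    updated[field] = list(updated.get(field, [])) + list(new_items)
    return updated


doc = {"_id": 1, "states": ["created"]}
# Sending only the new item gives the intended result.
only_new = apply_push_each(doc, "states", ["running"])
# Sending the whole array again duplicates what was already stored.
whole_array = apply_push_each(doc, "states", ["created", "running"])
```

Whether the missing duplicate check actually makes this faster is something the benchmark would have to confirm.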