I found that pymongo seemed slower than I would expect for such a simple query, so I inspected the returned cursor object and found that it already contains the data I need. I then iterated over that internal value directly (method 3) and found it runs at least 10x faster than methods 1 and 2 (which take about the same time as each other).
Note that my collection contains ~3500 items, each a dictionary with 10 or so keys, so it’s not terribly large.
I assume there’s a problem with using method 3 instead of the method prescribed in the docs. If someone can explain why I shouldn’t use method 3 and why it’s dangerous or problematic, it would be appreciated.
I also can’t understand why this would be the case: why isn’t the “normal” way of doing this already that fast?
Method 1 (~30ms execution time):
{x["some_id"]: x for x in my_collection.find()}
Method 2 (~30ms execution time):
{x["some_id"]: x for x in my_collection.aggregate([])}
Method 3 (~2ms execution time):
{x["some_id"]: x for x in my_collection.aggregate([]).__dict__['_CommandCursor__data']}
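For anyone who wants to reproduce the comparison, here is a minimal sketch of a harness that times each approach and also records how many items each one returns. The `compare` helper and the stand-in `docs` list are hypothetical and exist only so the sketch runs without a MongoDB server. They are not part of pymongo.

```python
import timeit


def compare(builders, repeats=20):
    """Time each zero-argument builder and record the size of the dict it returns."""
    results = {}
    for name, build in builders.items():
        # Average wall-clock time per call over `repeats` runs.
        seconds = timeit.timeit(build, number=repeats) / repeats
        results[name] = {"ms": seconds * 1000, "count": len(build())}
    return results


# Stand-in data (assumption): ~3500 small dictionaries, similar in shape to the collection.
docs = [{"some_id": i, "value": i * 2} for i in range(3500)]

# With a real collection, each lambda would wrap one of the three methods above.
builders = {
    "iterate": lambda: {x["some_id"]: x for x in iter(docs)},
    "list": lambda: {x["some_id"]: x for x in docs},
}

results = compare(builders)
for name, stats in results.items():
    print(f"{name}: {stats['ms']:.3f} ms, {stats['count']} items")
```

Against a real collection, comparing the `count` values alongside the timings would show whether the three methods actually return the same number of documents.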