This is a general question. I have two collections sitting in two different MongoDB databases. I need to run analytics on them: specifically, I need to compare the two datasets against each other for similarities and differences. The documents are heavily nested on both sides. The collections hold 5 million and 3 million records respectively. What is the best way to accomplish this data analysis task? Should I go with dataframes in Python? Or move everything into SQL Server tables and run the analysis there? Or use MongoDB's native aggregation pipelines from VS Code/Python?
What is the general practice? I am coming from the SQL world and trying to get my head around MongoDB. I was expecting to find views, stored procedures, temporary tables, and the other features I typically use to run this kind of analysis. How do people accomplish the same thing in MongoDB when it doesn’t seem to have all these handy features? Your advice is much needed!