I need help deciding on the right solution.
What is in the project.
It is a Django project. There is a table with users, a many-to-many table dependent on it, a table with geolocation data, and a large amount of user-specific data stored for each user; let's call that LargeData. LargeData is saved in Elasticsearch, linked to the corresponding records in PostgreSQL; it is not saved in PostgreSQL at all, but is written to ES when a user registers.
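For clarity, here is a minimal sketch of the entities described above. The real project uses Django models; I am using plain dataclasses here so the sketch is self-contained, and all field names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class User:
    # Stored in PostgreSQL; fields are illustrative only.
    id: int
    username: str
    group_ids: list[int] = field(default_factory=list)  # many-to-many link

@dataclass
class GeoLocation:
    # Stored in PostgreSQL, tied to a user.
    user_id: int
    lat: float
    lon: float

@dataclass
class LargeData:
    # NOT stored in PostgreSQL; written to Elasticsearch at registration,
    # keyed by the PostgreSQL user id.
    user_id: int
    payload: dict
```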
Problem.
After receiving data from ES, the program applies additional filtering in application code. This turned out to be slow, so we decided to move the filtering into either ES or PostgreSQL.
- I considered both options and concluded that it is better to do the filtering in PostgreSQL and use ES only to fetch LargeData. That way the program can use the convenient ORM for queries while still keeping the advantages of ES for retrieving the data it holds.
- Another programmer concluded that it is better to keep all the data in ES and add the filtering there as well.
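To make the first option concrete, here is a minimal sketch of the pattern I have in mind: filter in PostgreSQL, then fetch LargeData from ES by id. In-memory dicts stand in for both stores, and all names and data are hypothetical; in the real project the first step would be a Django queryset and the second an ES `mget` or a filtered search:

```python
# Stand-in for the PostgreSQL users table (would be a Django queryset).
USERS_PG = {
    1: {"city": "Berlin"},
    2: {"city": "Paris"},
    3: {"city": "Berlin"},
}
# Stand-in for the ES index keyed by user id (would be mget in reality).
LARGE_DATA_ES = {
    1: {"payload": "a"},
    2: {"payload": "b"},
    3: {"payload": "c"},
}

def filter_user_ids(city):
    # Step 1: relational filtering (ORM / SQL in the real project).
    return [uid for uid, row in USERS_PG.items() if row["city"] == city]

def fetch_large_data(user_ids):
    # Step 2: bulk fetch from ES by id, only for the users that matched.
    return {uid: LARGE_DATA_ES[uid] for uid in user_ids}

berlin_ids = filter_user_ids("Berlin")
result = fetch_large_data(berlin_ids)
```

The point of the sketch is that the heavy filtering never touches LargeData at all; ES is only asked for documents whose ids already passed the relational filter.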
Questions.
I thought it is better to use SQL to implement the links between tables, query them through the ORM, and solve the normalization problems properly. The other programmer says it is fine to keep a lot of data in one document (which is what ES does) because ES handles that better, and that PostgreSQL becomes slow with large amounts of data, with long-running geolocation queries in particular.
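One concrete trade-off behind the normalization question: when shared data is duplicated into every ES document, changing it means rewriting many documents, while in a normalized SQL schema it is one row. A toy sketch with hypothetical data:

```python
# Denormalized (ES style): each document carries a copy of the group name.
docs = [
    {"user_id": 1, "group": {"id": 10, "name": "admins"}},
    {"user_id": 2, "group": {"id": 10, "name": "admins"}},
]

# Normalized (SQL style): the group name lives in exactly one place.
groups = {10: "admins"}
users = [{"user_id": 1, "group_id": 10}, {"user_id": 2, "group_id": 10}]

# Renaming the group touches every denormalized document...
for d in docs:
    if d["group"]["id"] == 10:
        d["group"]["name"] = "staff"

# ...but only one row in the normalized schema.
groups[10] = "staff"
```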
And that made me wonder: why do you need PostgreSQL at all if ES is so great? You could save the data in SQL and then just transfer it to ES, and it would handle everything better, faster, and more conveniently. Why bother with normalization and indexing if at some point SQL will just become slow no matter what you do?
Could you give me some links to read on this topic?
I have tried reading up on it, but I always get the same answer: "ES is more convenient and faster".