I have AWS RDS Postgresql as a database. I have a table for storing childbirth_data data with currently 5M records and another for storing message to child stats (not the actual messages) with 30M records. Monthly around 0.3M person records and 3M message records are added to these tables
Query:
SELECT message_date, message_from, sms_type, message_type, direction, status,
cd.state ,cd.date_uploaded,
FROM “Suvita”.messages m, “Suvita”.childbirth_data cd
where m.contact_record_id = cd.id
Explain analyze results:
Hash Join (cost=568680.28..6272284.94 rows=29688640 width=319) (actual time=5473.787..96501.807 rows=30893261 loops=1)
Hash Cond: (m.contact_record_id = cd.id)
-> Seq Scan on messages m (cost=0.00..2739237.50 rows=29719350 width=174) (actual time=2.364..41071.274 rows=30936614 loops=1)
-> Hash (cost=402612.68..402612.68 rows=5347568 width=121) (actual time=5448.157..5448.158 rows=5349228 loops=1)
Buckets: 32768 Batches: 256 Memory Usage: 3518kB
-> Seq Scan on childbirth_data cd (cost=0.00..402612.68 rows=5347568 width=121) (actual time=0.011..3013.382 rows=5349228 loops=1)
Planning Time: 18.349 ms
Execution Time: 97849.004 ms
I have made aggregated tables for dashboard where aggregate stats are required and materialized views for different states or types of messages depending on reporting needs.
My writes on these tables are via scheduled job which run every night so even if ‘writes’ become comparatively slow because of more indexes it will not be a concern
I have already put in the following indexes:
person – id,state, mother name, mother phone, rchid, telerivet_contact_id
Messages – id, contact_record_id, message_date and state
My Query is still very slow and the indexes are not being used in the query, it seems. How do I make postgresql use indexes?
Would partitioning help? The issue is that i was thinking of partitioning by state but 1 state has 90% of the data so i think it will not be helpful. Also I do not want to touch the code which does data input.
I am not database expert so please advise if there is anything else I can do to make the reads on the tables faster?