I’m trying to improve the performance of one of our most expensive queries. I’m running EXPLAIN ANALYZE on the sandbox postgres 15 database, which has a much smaller dataset than the production postgres 15 database, and also likely doesn’t have the same data patterns that we have in production as well.
<code>SELECT COUNT(ut.id)
LEFT JOIN hippo.patient patient ON patient.id = ut.patient_id
LEFT JOIN hippo.health_center_group health_center_group ON health_center_group.id = ut.health_center_group_id
WHERE ut.test_group = ('')
AND ut.status IN (('active')::hippo.user_task_status)
AND ut.type IN (('review_consent_form')::hippo.user_task_type)
AND ut.assignee_id IS NULL
OR patient.health_center_permission_pending NOT IN (('exception'), ('multiple_pending'))
ut.type = 'review_consent_form'
AND ut.status IN ('active', 'exception')
AND health_center_group.patients_imported_at_tg IS NULL
<code>SELECT COUNT(ut.id)
FROM hippo.user_task ut
LEFT JOIN hippo.patient patient ON patient.id = ut.patient_id
LEFT JOIN hippo.health_center_group health_center_group ON health_center_group.id = ut.health_center_group_id
WHERE ut.test_group = ('')
AND ut.status IN (('active')::hippo.user_task_status)
AND ut.type IN (('review_consent_form')::hippo.user_task_type)
AND ut.assignee_id IS NULL
AND (
patient IS NULL
OR patient.health_center_permission_pending NOT IN (('exception'), ('multiple_pending'))
)
AND (
NOT (
ut.type = 'review_consent_form'
AND ut.status IN ('active', 'exception')
AND health_center_group.patients_imported_at_tg IS NULL
)
)
</code>
SELECT COUNT(ut.id)
FROM hippo.user_task ut
LEFT JOIN hippo.patient patient ON patient.id = ut.patient_id
LEFT JOIN hippo.health_center_group health_center_group ON health_center_group.id = ut.health_center_group_id
WHERE ut.test_group = ('')
AND ut.status IN (('active')::hippo.user_task_status)
AND ut.type IN (('review_consent_form')::hippo.user_task_type)
AND ut.assignee_id IS NULL
AND (
patient IS NULL
OR patient.health_center_permission_pending NOT IN (('exception'), ('multiple_pending'))
)
AND (
NOT (
ut.type = 'review_consent_form'
AND ut.status IN ('active', 'exception')
AND health_center_group.patients_imported_at_tg IS NULL
)
)
<code> Aggregate (cost=2591.66..2591.67 rows=1 width=8) (actual time=4.003..4.005 rows=1 loops=1)
-> Nested Loop Left Join (cost=83.67..2589.84 rows=727 width=8) (actual time=0.730..3.980 rows=251 loops=1)
Filter: ((patient.* IS NULL) OR ((patient.health_center_permission_pending)::text <> ALL ('{exception,multiple_pending}'::text[])))
Rows Removed by Filter: 4
-> Hash Left Join (cost=83.39..480.50 rows=728 width=16) (actual time=0.582..3.710 rows=255 loops=1)
Hash Cond: (ut.health_center_group_id = health_center_group.id)
Filter: ((ut.type <> 'review_consent_form'::hippo.user_task_type) OR (ut.status <> ALL ('{active,exception}'::hippo.user_task_status[])) OR (health_center_group.patients_imported_at_tg IS NOT NULL))
Rows Removed by Filter: 24
-> Bitmap Heap Scan on user_task ut (cost=55.68..450.84 rows=738 width=32) (actual time=0.324..3.353 rows=279 loops=1)
Recheck Cond: (type = 'review_consent_form'::hippo.user_task_type)
Filter: ((assignee_id IS NULL) AND ((test_group)::text = ''::text) AND (status = 'active'::hippo.user_task_status))
Rows Removed by Filter: 4416
-> Bitmap Index Scan on user_task_type_idx (cost=0.00..55.50 rows=4695 width=0) (actual time=0.211..0.211 rows=4735 loops=1)
Index Cond: (type = 'review_consent_form'::hippo.user_task_type)
-> Hash (cost=23.98..23.98 rows=298 width=16) (actual time=0.238..0.238 rows=298 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 22kB
-> Seq Scan on health_center_group (cost=0.00..23.98 rows=298 width=16) (actual time=0.011..0.152 rows=298 loops=1)
-> Index Scan using patient_pkey on patient (cost=0.29..2.88 rows=1 width=1000) (actual time=0.001..0.001 rows=0 loops=255)
Index Cond: (id = ut.patient_id)
<code> Aggregate (cost=2591.66..2591.67 rows=1 width=8) (actual time=4.003..4.005 rows=1 loops=1)
-> Nested Loop Left Join (cost=83.67..2589.84 rows=727 width=8) (actual time=0.730..3.980 rows=251 loops=1)
Filter: ((patient.* IS NULL) OR ((patient.health_center_permission_pending)::text <> ALL ('{exception,multiple_pending}'::text[])))
Rows Removed by Filter: 4
-> Hash Left Join (cost=83.39..480.50 rows=728 width=16) (actual time=0.582..3.710 rows=255 loops=1)
Hash Cond: (ut.health_center_group_id = health_center_group.id)
Filter: ((ut.type <> 'review_consent_form'::hippo.user_task_type) OR (ut.status <> ALL ('{active,exception}'::hippo.user_task_status[])) OR (health_center_group.patients_imported_at_tg IS NOT NULL))
Rows Removed by Filter: 24
-> Bitmap Heap Scan on user_task ut (cost=55.68..450.84 rows=738 width=32) (actual time=0.324..3.353 rows=279 loops=1)
Recheck Cond: (type = 'review_consent_form'::hippo.user_task_type)
Filter: ((assignee_id IS NULL) AND ((test_group)::text = ''::text) AND (status = 'active'::hippo.user_task_status))
Rows Removed by Filter: 4416
Heap Blocks: exact=306
-> Bitmap Index Scan on user_task_type_idx (cost=0.00..55.50 rows=4695 width=0) (actual time=0.211..0.211 rows=4735 loops=1)
Index Cond: (type = 'review_consent_form'::hippo.user_task_type)
-> Hash (cost=23.98..23.98 rows=298 width=16) (actual time=0.238..0.238 rows=298 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 22kB
-> Seq Scan on health_center_group (cost=0.00..23.98 rows=298 width=16) (actual time=0.011..0.152 rows=298 loops=1)
-> Index Scan using patient_pkey on patient (cost=0.29..2.88 rows=1 width=1000) (actual time=0.001..0.001 rows=0 loops=255)
Index Cond: (id = ut.patient_id)
Planning Time: 1.524 ms
Execution Time: 4.105 ms
(22 rows)
</code>
Aggregate (cost=2591.66..2591.67 rows=1 width=8) (actual time=4.003..4.005 rows=1 loops=1)
-> Nested Loop Left Join (cost=83.67..2589.84 rows=727 width=8) (actual time=0.730..3.980 rows=251 loops=1)
Filter: ((patient.* IS NULL) OR ((patient.health_center_permission_pending)::text <> ALL ('{exception,multiple_pending}'::text[])))
Rows Removed by Filter: 4
-> Hash Left Join (cost=83.39..480.50 rows=728 width=16) (actual time=0.582..3.710 rows=255 loops=1)
Hash Cond: (ut.health_center_group_id = health_center_group.id)
Filter: ((ut.type <> 'review_consent_form'::hippo.user_task_type) OR (ut.status <> ALL ('{active,exception}'::hippo.user_task_status[])) OR (health_center_group.patients_imported_at_tg IS NOT NULL))
Rows Removed by Filter: 24
-> Bitmap Heap Scan on user_task ut (cost=55.68..450.84 rows=738 width=32) (actual time=0.324..3.353 rows=279 loops=1)
Recheck Cond: (type = 'review_consent_form'::hippo.user_task_type)
Filter: ((assignee_id IS NULL) AND ((test_group)::text = ''::text) AND (status = 'active'::hippo.user_task_status))
Rows Removed by Filter: 4416
Heap Blocks: exact=306
-> Bitmap Index Scan on user_task_type_idx (cost=0.00..55.50 rows=4695 width=0) (actual time=0.211..0.211 rows=4735 loops=1)
Index Cond: (type = 'review_consent_form'::hippo.user_task_type)
-> Hash (cost=23.98..23.98 rows=298 width=16) (actual time=0.238..0.238 rows=298 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 22kB
-> Seq Scan on health_center_group (cost=0.00..23.98 rows=298 width=16) (actual time=0.011..0.152 rows=298 loops=1)
-> Index Scan using patient_pkey on patient (cost=0.29..2.88 rows=1 width=1000) (actual time=0.001..0.001 rows=0 loops=255)
Index Cond: (id = ut.patient_id)
Planning Time: 1.524 ms
Execution Time: 4.105 ms
(22 rows)
I’ve found that changing the 11th line of the query from patient IS NULL
to ut.patient_id IS NULL
significantly decreases the cost, however it also significantly increases the “actual time” and “execution time”:
<code>Aggregate (cost=1551.69..1551.70 rows=1 width=8) (actual time=15.309..15.311 rows=1 loops=1)
-> Hash Left Join (cost=1150.85..1549.88 rows=727 width=8) (actual time=13.147..15.291 rows=251 loops=1)
Hash Cond: (ut.patient_id = patient.id)
Filter: ((ut.patient_id IS NULL) OR ((patient.health_center_permission_pending)::text <> ALL ('{exception,multiple_pending}'::text[])))
Rows Removed by Filter: 4
-> Hash Left Join (cost=83.39..480.50 rows=728 width=16) (actual time=0.932..3.174 rows=255 loops=1)
Hash Cond: (ut.health_center_group_id = health_center_group.id)
Filter: ((ut.type <> 'review_consent_form'::hippo.user_task_type) OR (ut.status <> ALL ('{active,exception}'::hippo.user_task_status[])) OR (health_center_group.patients_imported_at_tg IS NOT NULL))
Rows Removed by Filter: 24
-> Bitmap Heap Scan on user_task ut (cost=55.68..450.84 rows=738 width=32) (actual time=0.468..2.637 rows=279 loops=1)
Recheck Cond: (type = 'review_consent_form'::hippo.user_task_type)
Filter: ((assignee_id IS NULL) AND ((test_group)::text = ''::text) AND (status = 'active'::hippo.user_task_status))
Rows Removed by Filter: 4416
-> Bitmap Index Scan on user_task_type_idx (cost=0.00..55.50 rows=4695 width=0) (actual time=0.267..0.267 rows=4735 loops=1)
Index Cond: (type = 'review_consent_form'::hippo.user_task_type)
-> Hash (cost=23.98..23.98 rows=298 width=16) (actual time=0.445..0.446 rows=298 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 22kB
-> Seq Scan on health_center_group (cost=0.00..23.98 rows=298 width=16) (actual time=0.019..0.381 rows=298 loops=1)
-> Hash (cost=711.65..711.65 rows=28465 width=13) (actual time=11.988..11.989 rows=28465 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 1508kB
-> Seq Scan on patient (cost=0.00..711.65 rows=28465 width=13) (actual time=0.014..7.222 rows=28465 loops=1)
Execution Time: 15.512 ms
<code>Aggregate (cost=1551.69..1551.70 rows=1 width=8) (actual time=15.309..15.311 rows=1 loops=1)
-> Hash Left Join (cost=1150.85..1549.88 rows=727 width=8) (actual time=13.147..15.291 rows=251 loops=1)
Hash Cond: (ut.patient_id = patient.id)
Filter: ((ut.patient_id IS NULL) OR ((patient.health_center_permission_pending)::text <> ALL ('{exception,multiple_pending}'::text[])))
Rows Removed by Filter: 4
-> Hash Left Join (cost=83.39..480.50 rows=728 width=16) (actual time=0.932..3.174 rows=255 loops=1)
Hash Cond: (ut.health_center_group_id = health_center_group.id)
Filter: ((ut.type <> 'review_consent_form'::hippo.user_task_type) OR (ut.status <> ALL ('{active,exception}'::hippo.user_task_status[])) OR (health_center_group.patients_imported_at_tg IS NOT NULL))
Rows Removed by Filter: 24
-> Bitmap Heap Scan on user_task ut (cost=55.68..450.84 rows=738 width=32) (actual time=0.468..2.637 rows=279 loops=1)
Recheck Cond: (type = 'review_consent_form'::hippo.user_task_type)
Filter: ((assignee_id IS NULL) AND ((test_group)::text = ''::text) AND (status = 'active'::hippo.user_task_status))
Rows Removed by Filter: 4416
Heap Blocks: exact=306
-> Bitmap Index Scan on user_task_type_idx (cost=0.00..55.50 rows=4695 width=0) (actual time=0.267..0.267 rows=4735 loops=1)
Index Cond: (type = 'review_consent_form'::hippo.user_task_type)
-> Hash (cost=23.98..23.98 rows=298 width=16) (actual time=0.445..0.446 rows=298 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 22kB
-> Seq Scan on health_center_group (cost=0.00..23.98 rows=298 width=16) (actual time=0.019..0.381 rows=298 loops=1)
-> Hash (cost=711.65..711.65 rows=28465 width=13) (actual time=11.988..11.989 rows=28465 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 1508kB
-> Seq Scan on patient (cost=0.00..711.65 rows=28465 width=13) (actual time=0.014..7.222 rows=28465 loops=1)
Planning Time: 13.285 ms
Execution Time: 15.512 ms
(24 rows)
</code>
Aggregate (cost=1551.69..1551.70 rows=1 width=8) (actual time=15.309..15.311 rows=1 loops=1)
-> Hash Left Join (cost=1150.85..1549.88 rows=727 width=8) (actual time=13.147..15.291 rows=251 loops=1)
Hash Cond: (ut.patient_id = patient.id)
Filter: ((ut.patient_id IS NULL) OR ((patient.health_center_permission_pending)::text <> ALL ('{exception,multiple_pending}'::text[])))
Rows Removed by Filter: 4
-> Hash Left Join (cost=83.39..480.50 rows=728 width=16) (actual time=0.932..3.174 rows=255 loops=1)
Hash Cond: (ut.health_center_group_id = health_center_group.id)
Filter: ((ut.type <> 'review_consent_form'::hippo.user_task_type) OR (ut.status <> ALL ('{active,exception}'::hippo.user_task_status[])) OR (health_center_group.patients_imported_at_tg IS NOT NULL))
Rows Removed by Filter: 24
-> Bitmap Heap Scan on user_task ut (cost=55.68..450.84 rows=738 width=32) (actual time=0.468..2.637 rows=279 loops=1)
Recheck Cond: (type = 'review_consent_form'::hippo.user_task_type)
Filter: ((assignee_id IS NULL) AND ((test_group)::text = ''::text) AND (status = 'active'::hippo.user_task_status))
Rows Removed by Filter: 4416
Heap Blocks: exact=306
-> Bitmap Index Scan on user_task_type_idx (cost=0.00..55.50 rows=4695 width=0) (actual time=0.267..0.267 rows=4735 loops=1)
Index Cond: (type = 'review_consent_form'::hippo.user_task_type)
-> Hash (cost=23.98..23.98 rows=298 width=16) (actual time=0.445..0.446 rows=298 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 22kB
-> Seq Scan on health_center_group (cost=0.00..23.98 rows=298 width=16) (actual time=0.019..0.381 rows=298 loops=1)
-> Hash (cost=711.65..711.65 rows=28465 width=13) (actual time=11.988..11.989 rows=28465 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 1508kB
-> Seq Scan on patient (cost=0.00..711.65 rows=28465 width=13) (actual time=0.014..7.222 rows=28465 loops=1)
Planning Time: 13.285 ms
Execution Time: 15.512 ms
(24 rows)
Cost significantly decreases from 2591.66..2591.67 to 1551.69..1551.70 . However, the “actual time” increases from 4.003..4.005 to 15.309..15.311 and the “execution time” increases from 4.105 ms to 15.512 ms.
I know “cost” is just an estimate (and yes I did run analyze on the tables before doing the analysis), however I don’t know how to weigh these metrics against each other when doing analysis on the sandbox DB. Will this change result in a performance improvement or degradation?