I have a query of the following shape:
SELECT d1.attrA, d1.attrB FROM fact f JOIN dim1 d1 ON f.id = d1.fact_id
WHERE f.id = ${factId} AND f.date BETWEEN ${dateFrom} AND ${dateTo}
-- uses index on fact(id, date)
UNION ALL
SELECT d2.attrA, d2.attrB FROM fact f JOIN dim2 d2 ON f.id = d2.fact_id
WHERE f.id = ${factId} AND d2.date BETWEEN ${dateFrom} AND ${dateTo}
-- uses index on dim2(fact_id, date)
There are indexes in place to make it efficient.
If I refactor to extract common criteria to outer CTE (I’m trying to turn it into a view later):
WITH d1 AS (
SELECT d1.attrA, f.attrB, f.id AS fact_id, f.date AS x_date
FROM fact f JOIN dim1 ON f.id = d1.fact_id
),
d2 AS (
SELECT d2.attrA, d2.attrB, f.id AS fact_id, d2.date AS x_date
FROM fact f JOIN dim2 ON f.id = d2.fact_id
),
all AS (
SELECT * FROM d1
UNION ALL SELECT * FROM d2
),
SELECT attr1, attr2 FROM all
JOIN fact f ON fact_id = f.id
WHERE f.id = ${factId} AND x_date BETWEEN ${dateFrom} AND ${dateTo}
Suddenly it’s executing very slowly.
Looks like filter on x_date is not pushed down correctly.
I still haven’t unraveled the query plan but it looks like for dim2 it is no longer using index on dim2(fact_id, date) but instead hits fact(id).
Am I right to expect this to work? Is this a known limitation of optimizer?
PostgreSQL 15 (Aurora).