This example is taken from the Postgres regression test suite.
Setup:
CREATE TABLE J1_TBL (
i integer,
j integer,
t text
);
CREATE TABLE J2_TBL (
i integer,
k integer
);
INSERT INTO J1_TBL VALUES (1, 4, 'one');
INSERT INTO J1_TBL VALUES (2, 3, 'two');
INSERT INTO J1_TBL VALUES (3, 2, 'three');
INSERT INTO J1_TBL VALUES (4, 1, 'four');
INSERT INTO J1_TBL VALUES (5, 0, 'five');
INSERT INTO J1_TBL VALUES (6, 6, 'six');
INSERT INTO J1_TBL VALUES (7, 7, 'seven');
INSERT INTO J1_TBL VALUES (8, 8, 'eight');
INSERT INTO J1_TBL VALUES (0, NULL, 'zero');
INSERT INTO J1_TBL VALUES (NULL, NULL, 'null');
INSERT INTO J1_TBL VALUES (NULL, 0, 'zero');
INSERT INTO J2_TBL VALUES (1, -1);
INSERT INTO J2_TBL VALUES (2, 2);
INSERT INTO J2_TBL VALUES (3, -3);
INSERT INTO J2_TBL VALUES (2, 4);
INSERT INTO J2_TBL VALUES (5, -5);
INSERT INTO J2_TBL VALUES (5, -5);
INSERT INTO J2_TBL VALUES (0, NULL);
INSERT INTO J2_TBL VALUES (NULL, NULL);
INSERT INTO J2_TBL VALUES (NULL, 0);
So we see that both tables have 2 rows each with a NULL
in the first column, which is called i
on both tables.
Query:
SELECT *
FROM J1_TBL LEFT OUTER JOIN J2_TBL USING (i)
ORDER BY i, k, t;
Expected output:
i | j | t | k
---+---+-------+----
0 | | zero |
1 | 4 | one | -1
2 | 3 | two | 2
2 | 3 | two | 4
3 | 2 | three | -3
4 | 1 | four |
5 | 0 | five | -5
5 | 0 | five | -5
6 | 6 | six |
7 | 7 | seven |
8 | 8 | eight |
| | null |
| 0 | zero |
(13 rows)
With two rows on J1_TBL
where i
is NULL
, each matching two rows on J2_TBL
where i
is NULL
, I’d expect there to be 15 rows, with the bottom of the result table containing 4 rows where i
is NULL
. Why is this not the case?