I have multiple databases with identical structure but different data. I am running my query against them, and for some databases the query works quickly and for some it is slow.
This is the query:
SELECT *
FROM table_a
LEFT JOIN table_b ON table_b.id = table_a.table_b_id
WHERE table_b.id IS NOT NULL
ORDER BY table_a.id
LIMIT 100
Table A is joined to Table B, and I only need the rows where the corresponding entry in the Table B exists. The column table_b_id
does not have a foreign key constraint.
Most of the time the query works well, and this is what EXPLAIN
shows me:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | table_a | index | ix_table_a_table_b_id | PRIMARY | 1002 | 100 | 100.0 | |||
1 | SIMPLE | table_b | eq_ref | PRIMARY | PRIMARY | 1002 | table_a.table_b_id | 1 | 100.0 | Using where |
This makes sense: using primary key of Table A, get first 100 rows from Table A, then get the ids that reference Table B, then get rows with these ids from Table B using its primary key.
However, for some databases the query is slow and this is the EXPLAIN
output:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | table_b | ALL | PRIMARY | 7842 | 90.0 | Using where; Using temporary; Using filesort | ||||
1 | SIMPLE | table_a | ref | ix_table_a_table_b_id | ix_table_a_table_b_id | 1003 | table_b.id | 87 | 100.0 |
If I explicitly tell MySQL to use the PRIMARY
indices, the query works quickly on all databases (and EXPLAIN
output is the same on all databases too):
SELECT *
FROM table_a USE INDEX (PRIMARY)
LEFT JOIN table_b USE INDEX (PRIMARY)
ON table_b.id = table_a.table_b_id
WHERE table_b.id IS NOT NULL
ORDER BY table_a.id
LIMIT 100
What am I doing wrong to make the optimizer fail in such a way? Is there really no better solution than to explicitly tell it to use specific indices?