I’m looking for efficient strategies to optimize a MySQL query that generates Cartesian products in scenarios involving large data sets. The specific use case involves pairing cast and crew members for each movie in a database, which can result in a very large number of combinations due to the size of the data involved.
The current query is as follows:
SELECT
m.title,
pc.person_name AS cast_member,
pr.person_name AS crew_member
FROM
movie m
JOIN
movie_cast mc ON m.movie_id = mc.movie_id
JOIN
person pc ON mc.person_id = pc.person_id
JOIN
movie_crew mcc ON m.movie_id = mcc.movie_id
JOIN
person pr ON mcc.person_id = pr.person_id;
I’ve considered using derived tables with the LATERAL clause to make this more efficient, but it led to inaccurate results. Are there other optimization techniques or approaches in MySQL that could improve the performance of this query?
Please note that this query is used purely for research purposes.