Currently, I am creating a PHP script to facilitate the inclusion of new SQL code snippets into an SQL database. To achieve this, I began by trying to split the SQL into parts to assemble it later.
SELECT (group 1, any thing here, including subqueries)
FROM (group 2, table name with alias)
(
group 3 all joins including subqueries
)
WHERE (group 4 grab all conditionals including subqueries)
GROUP BY (group 5 grab everything here)
HAVING (group 6 grab everything here)
ORDER BY (group 7 grab everything here)
LIMIT (group 8 grab everything here)
Initially, I managed to come up with this regex that can capture up to the alias part. However, I’m not exactly sure now how to capture the other parts. I added a lookahead to capture the complex parts that may be between the SELECT … FROM of the main query, but I’m not exactly sure how to handle the rest.
I think my main difficulty is actually maintaining the regex search in the main query and not looking at possible subqueries.
^bSELECTb(.*?)bFROMbs+(w+(?: (?:as )?w+)*)(?=s*(?:left|right|inner|outer|join|where|group|order|s|$))
1