This is a follow-up question on A working RegEx match that begins at the first of two OR-words takes the unwanted last OR-word instead if I place more RegEx before it [duplicate]. Thanks go to the helpful remarks of this user and this user under that question.
I got:
"(?>=SELECTs+|,s*)(.+)s+ASs+Fehler.*?(?<=WHERE|HAVING)(.*)"
Or without any lookarounds:
"(?:=SELECTs+|,s*)(.+?)s+ASs+Fehler.*?((?:WHERE|HAVING).*)"
as the needed RegEx for:
SELECT a, bc d AS Error FROM y WHERE 1=1 HAVING 1=1
to be split into two matching groups:
-
bc d
as the code that builds the column called “Error”. This is dummy code, do not care about the SQL. -
Anything after one of the two words “WHERE” or “HAVING”.
Yet, having tested everything with the default language and flavour of www.regex101.com, which is PCRE2 (PHP >=7.3)
at the time of writing, it did not work in MS Access VBA, which throws for any pattern with a lookaround:
Microsoft Visual Basic – Run-time error ‘5017’: Application-defined or object-defined error.
With Lookarounds:
regexPattern = "(?>=SELECTs+|,s*)(.+)s+ASs+Fehler.*?(?<=WHERE|HAVING)(.*)" '-> error 5017 pop up
regexPattern = "(?<=WHERE|HAVING)(.*)" '-> error 5017 pop up
regexPattern = "(?>=SELECTs+|,s*)(.+)s+ASs+Fehler.*" '-> error 5017 pop up
Without lookarounds, it seems to work in MS Access:
regexPattern = "(?:=SELECTs+|,s*)(.+?)s+ASs+Fehler.*?((?:WHERE|HAVING).*)" ' -> no error 5017, but no matches either
regexPattern = "((?:WHERE|HAVING).*)" '-> working
regexPattern = "(?:=SELECTs+|,s*)(.+?)s+ASs+Fehler" '-> working
I tested it on a bigger SQL and found no matches again for (?:=SELECTs+|,s*)(.+?)s+ASs+Fehler.*?((?:WHERE|HAVING).*)
, and no matches in Regex101 either. I need a Regex that finds the two matches as shown above and stops before the order by
, and works in MS Access VBA:
SELECT test1.col1, test1.col2, test1.col3, test1.col4, "aaa" AS Error INTO test2
FROM test1
WHERE (((test1.col2)="S") AND ((test1.col4)="bbb")) OR (((test1.col2)="S") AND ((test1.col4)="ccc")) OR (((test1.col2)="S") AND ((test1.col4) Is Null))
HAVING 1=1
ORDER BY test1.col2;
What is the RegEx that works in MS Access VBA
- to get the two matching groups of above:
Error
column definition without theAS Error
tail,WHERE
/HAVING
clause, beginning with the earliest of the two,
- allows one of the two or both matches not to be there at all if the SQL code does not have the needed keywords,
- does not stop at line breaks,
- and stops before
ORDER BY
?
Since it is MS Access VBA, lookaheads and lookbehinds are not allowed, as it seems.
The needed RegEx is:
(?:(?:=SELECTs+|,s*)([^,]+?)s+ASs+Error[sS]*?)?(?:((?:WHERE|HAVING)[sS]*)?)(?:ORDER.*)
See https://regex101.com/r/86VUKs/1
Group 1:
"aaa"
Group 2:
WHERE (((test1.col2)="S") AND ((test1.col4)="bbb")) OR (((test1.col2)="S") AND ((test1.col4)="ccc")) OR (((test1.col2)="S") AND ((test1.col4) Is Null))
The [sS]
instead of .
is needed to also grab line breaks.
If I change the WHERE
to abcd
, the HAVING
clause found as the next:
And if I change the HAVING
to abcd
as well, nothing is found and the pattern can handle that:
If I now also change Error
to abcd
, the pattern does not find any group anymore:
Thus, this works. I then tested it in MS Access where it worked as well.