I must be missing something obvious, because I cannot find out what’s wrong with my regular expression:
I’m trying to parse a C source file that contains strings within a line (those strings are themselves regular expressions, but that should not matter here).
Basically /"([^"]*)"/
should capture the string (without quotes) unless there are double quotes in it.
I could match escaped double quotes inside the string using /(\\"*)/
too, but I failed to combine both regular expressions:
Using /"((?:[^"]|\\")*)"/
(capture the text between double quotes up to the first non-escaped double quote), the match already ends at the first "
(the escaped one), as shown in this example debugger session:
DB<15> $x='"SAMSUNG SSD SM841N? (2\.5\"? 7mm |mSATA )?(128|256|512)GB( SED)?|"'
DB<16> x $x =~ s/"((?:[^"]|\\")*)"//
0 1
DB<17> x $x
0 '? 7mm |mSATA )?(128|256|512)GB( SED)?|"'
DB<18>
While writing this question, I tried swapping the two alternatives, and suddenly it worked:
DB<18> $x='"SAMSUNG SSD SM841N? (2\.5\"? 7mm |mSATA )?(128|256|512)GB( SED)?|"'
DB<19> x $x =~ s/"((?:\\"|[^"])*)"//
0 1
DB<20> x $x
0 ''
DB<21>
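For anyone who wants to reproduce this outside the debugger, the two substitutions can be run from a small standalone script. This is only a minimal sketch: the helper name strip_first_string is made up for illustration, and the input assumes the quote after the 5 is escaped with a backslash (\") and that the alternative for an escaped quote is written \\" in the pattern.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the debugger session): removes the first
# double-quoted string matched by $re and returns what is left of $text.
sub strip_first_string {
    my ($text, $re) = @_;
    $text =~ s/$re//;
    return $text;
}

# Assumed input: the embedded quote after "5" is escaped as \"
my $x = '"SAMSUNG SSD SM841N? (2\.5\"? 7mm |mSATA )?(128|256|512)GB( SED)?|"';

# [^"] is tried first, so it consumes the backslash on its own and the
# group stops at the quote that follows it.
my $char_first = qr/"((?:[^"]|\\")*)"/;

# \\" is tried first, so backslash and quote are consumed as one unit.
my $escape_first = qr/"((?:\\"|[^"])*)"/;

print "char first:   '", strip_first_string($x, $char_first),   "'\n";
print "escape first: '", strip_first_string($x, $escape_first), "'\n";
```

With that input, the first line should print the same leftover text as the first debugger session, and the second line should print an empty string.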
So aren’t the regular expressions A|B and B|A equivalent?
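A much smaller test case, unrelated to my C file, suggests that the order does matter whenever one alternative can match a prefix of what the other would match. The input 'ab' and the two patterns below are made up purely for illustration:

```perl
use strict;
use warnings;

# Same input, same two alternatives; only their order differs.
my ($first)  = 'ab' =~ /^(a|ab)/;   # "a" is tried first and already succeeds
my ($second) = 'ab' =~ /^(ab|a)/;   # "ab" is tried first and succeeds

print "a|ab captured '$first'\n";
print "ab|a captured '$second'\n";
```

Here the first pattern captures 'a' and the second captures 'ab', so the alternatives appear to be tried strictly from left to right rather than by longest match.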