I’m trying to extract LaTeX code that’s embedded in a webpage’s HTML.
If we take the string:
string = '<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mtext> </mtext><mfrac><mrow><mi>a</mi><mi>x</mi><mo>+</mo><mi>b</mi></mrow><mi>c</mi></mfrac><mo>=</mo><mi>d</mi></mrow><annotation encoding="application/x-tex">\frac{ax+b}{c}=d</annotation></semantics></math>'
I want just the "\frac{ax+b}{c}=d" part.
When I ran the code:
import re
match = re.search(r'x-tex">.*</an', string)
print(match.group(0))
I got a match of x-tex"> rac{ax+b}{c}=d</an
The f character is missing from the matched string.
Does anyone have an idea of how I can include it in the match?
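For reference, here is a minimal sketch of what I suspect is going on. It assumes the string was written as a normal (non-raw) Python literal containing \frac, in which case Python reads \f as a single form-feed character, so the f is already gone before the regex runs. The shortened MathML string and the names raw_string and latex are just for illustration, not my real page data. The sketch uses a raw string literal and a capture group so that only the annotation body comes back:

```python
import re

# In a normal string literal, '\f' is one form-feed character, not a backslash plus 'f'.
assert '\frac' == '\x0c' + 'rac'

# A raw string literal (r'...') keeps the backslash in \frac untouched.
# Shortened, hypothetical version of the MathML string from the question.
raw_string = (
    r'<math><semantics><mrow><mi>d</mi></mrow>'
    r'<annotation encoding="application/x-tex">\frac{ax+b}{c}=d</annotation>'
    r'</semantics></math>'
)

# The parentheses capture only the annotation body;
# the non-greedy .*? stops at the first closing tag.
pattern = r'<annotation encoding="application/x-tex">(.*?)</annotation>'
match = re.search(pattern, raw_string, re.DOTALL)

latex = match.group(1).strip() if match else None
print(latex)  # prints \frac{ax+b}{c}=d
```

If the HTML is read from a file or an HTTP response rather than typed into the source, the backslash should already be intact and only the capture group is needed.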