I am trying to parse an XML string that contains Windows newlines (the CR, LF pair):
from lxml.etree import XML
root = XML('<root>_\r\n_\n_</root>')
print(
[ord(char) for char in root.text],
)
but the resulting text surprisingly contains only Linux newlines (LF character):
[95, 10, 95, 10, 95]
Is this a feature, and is it documented somewhere? Is it possible to change this behavior so that I can access the unmodified text?
I am using lxml 5.2.2 (currently the newest release).
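For comparison, here is a minimal sketch using the standard library's xml.etree.ElementTree instead of lxml (so it only suggests, rather than proves, that lxml behaves the same way). The helper name code_points is my own. It parses the same string with a literal CR LF, then a variant where the CR is written as the numeric character reference &#13;, which as far as I understand is exempt from the end-of-line normalization described in the XML 1.0 specification (section 2.11).

```python
import xml.etree.ElementTree as ET


def code_points(xml_source):
    """Parse xml_source and return the code points of the root element's text."""
    return [ord(char) for char in ET.fromstring(xml_source).text]


# Literal CR LF in the source, as in the lxml example above.
literal = code_points('<root>_\r\n_\n_</root>')

# The CR written as the numeric character reference &#13; instead.
escaped = code_points('<root>_&#13;\n_\n_</root>')

print(literal)  # [95, 10, 95, 10, 95]
print(escaped)  # [95, 13, 10, 95, 10, 95]
```

If this holds for lxml too, the CR only survives when the producer of the XML escapes it, which does not help with input I cannot control.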