I have an input string in HTML that needs to be parsed and written out as DITA-compatible XML.
Input:
<p>Line with following newline<br>Line with two following newlines<br><br>Line with no following newline</p>
Desired Output:
<p>Line with following newline<?linebreak?>Line with two following newlines<?linebreak?><?linebreak?>Line with no following newline</p>
package require tdom
set xml {<p>Line with following newline<br>Line with two following newlines<br><br>Line with no following newline</p>}
puts "Input:"
puts "$xml"
set doc [dom parse -html -keepEmpties $xml]
set root [$doc documentElement]
foreach node [$root getElementsByTagName br] {
    $node delete
    #$node appendXML "<?linebreak?>"
}
puts "Output:"
puts [$doc asXML -indent none]
If I uncomment the line #$node appendXML "<?linebreak?>", the script fails. I'm new to tdom but not to Tcl. Or maybe someone has a different idea on how to preserve line breaks in XML, specifically DITA?
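For reference, here is a minimal, untested sketch of one direction that might work. It assumes the failure comes from two things: the commented line runs after `$node delete`, when the node command presumably no longer exists, and `appendXML` parses its argument and appends the result as a child of the node rather than replacing it. The sketch instead builds a processing-instruction node with `createProcessingInstruction` and swaps it in with `replaceChild`. The variable names `html`, `br`, and `pi` are mine, and the exact serialization of the PI (for example a space before `?>`) is not something I have verified.

```tcl
package require tdom

set html {<p>Line with following newline<br>Line with two following newlines<br><br>Line with no following newline</p>}

# Parse with the forgiving HTML parser so the unclosed <br> tags are accepted.
set doc [dom parse -html -keepEmpties $html]
set root [$doc documentElement]

# selectNodes returns the full list up front, so it is safe to modify the
# tree while iterating over it.
foreach br [$root selectNodes {//br}] {
    # Assumption: an empty data string is acceptable for the PI.
    set pi [$doc createProcessingInstruction linebreak ""]
    # Swap the PI into the position the <br> element occupied.
    [$br parentNode] replaceChild $pi $br
    # The replaced node is detached from the tree; free it explicitly.
    $br delete
}

puts [$doc asXML -indent none]
```

Because the input is parsed as HTML, the serialized result may include wrapper elements beyond the original `<p>`, so it may be necessary to serialize only the `p` node (for example via `selectNodes {//p}`) rather than the whole document.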