I have a sort of AST defined in XML that i’m trying to parse and evaluate. The XML tree contains the tokens and all the information i need. However, i’m finding it difficult to do it “properly”. Here’s what i have:
<node type="operation">
<value name="OP">AND</value>
<value name="A">
<node type="comparison">
<value name="OP">EQ</value>
<value name="A">1</value>
<value name="B">1</value>
</node>
</value>
<value name="B">
<node type="comparison">
<value name="OP">EQ</value>
<value name="A">2</value>
<value name="B">2</value>
</node>
</value>
</node>
def getNode() {
switch (xmlNode[type]) {
case 'operation':
new OperationNode(xmlNode[value.OP], getNode(xmlNode[value.A]), getNode(xmlNode[value.B]))
case 'comparison':
new ComparisonNode(xmlNode[value.OP], xmlNode[value.A], xmlNode[value.B])
}
}
nodes = getNode(xmlTree)
nodes.evaluate()
That’s pseudo-code, but you can get an idea of what i’m doing. Basically, i recursively go into every node element of the XML, discover what it is based on my switch-case, and then create the proper object in my domain, so i can later evaluate all the nodes together.
But i have a few questions:
- How can i get rid of the switch? Seems very dirty to me, but i can’t think of a better solution
- How can i evaluate all nodes elegantly? Right now, all nodes inherit from a base
Node
class, which has aneval()
method, which by polymorphism, i endup having all the different behaviors between conditions and operations. - Is there a more efficient way of traversing this XML?
Depending on your implementation language, there may be dynamic mappings or code generators to map XML to your object graph.
If you don’t want to go that route, I’d suggest using an event based parser rather than creating the XML document, then creating your AST from the events instead using recursive descent. So when you see the start of the value
element you call a function which returns a ValueNode
and let the call stack manage the stack. (this assumes the call stack is large enough for your deepest nested expression, which is probably true in practice but does open up a failure mode if someone wants to create a nasty XML document).
If you do go down the hand-coded parsing route, you won’t get away from switching on the element tag name. You might hide it using a map of tag name to function to call, but if your implementation language supports switch on strings then that’s unlikely to improve readability much.