I’m in the process of creating my own programming language, which I do for learning purposes. I already wrote the lexer and a recursive descent parser for a subset of my language (I currently support mathematical expressions, such as + - * /
and parenthesis). The parser hands me back an Abstract Syntax Tree, on which I call the Evaluate
method to get the result of the expression. Everything works fine. Here is approximately my current situation (Code examples in C#, although this is pretty much language-agnostic):
public abstract class Node
{
public abstract Double Evaluate();
}
public class OperationNode : Node
{
public Node Left { get; set; }
private String Operator { get; set; }
private Node Right { get; set; }
public Double Evaluate()
{
if (Operator == "+")
return Left.Evaluate() + Right.Evaluate();
//Same logic for the other operators
}
}
public class NumberNode : Node
{
public Double Value { get; set; }
public Double Evaluate()
{
return Value;
}
}
However, I would like to decouple the algorithm from the tree nodes because I want to apply Open/Closed Principle so I don’t have to reopen every node class when I want to implement code generation for example. I read that the Visitor Pattern is good for that. I have a good understanding of how the pattern works and that using double dispatch is the way to go. But due to the recursive nature of the tree, I’m not sure how I should approach it. Here is what my Visitor would look like:
public class AstEvaluationVisitor
{
public void VisitOperation(OperationNode node)
{
// Here is where I operate on the operation node.
// How do I implement this method?
// OperationNode has two child nodes, which may have other children
// How do I work the Visitor Pattern around a recursive structure?
// Should I access children nodes here and call their Accept method so they get visited?
// Or should their Accept method be called from their parent's Accept?
}
// Other Visit implementation by Node type
}
So this is my problem. I want to tackle it immediately while my language does not support a lot of functionnality to avoid having a bigger problem later.
I did not post this to StackOverflow because I don’t want you to provide an implementation. I only want you to share ideas and concepts that I might have missed, and how I should approach this.
2
It is up to the visitor implementation to decide whether to visit child nodes and in which order. That’s the whole point of the visitor pattern.
In order to adapt the visitor for more situations it is helpful (and quite common) to use generics like this (it’s Java):
public interface ExpressionNodeVisitor<R, P> {
R visitNumber(NumberNode number, P p);
R visitBinary(BinaryNode expression, P p);
// ...
}
And an accept
method would look like this:
public interface ExpressionNode extends Node {
<R, P> R accept(ExpressionNodeVisitor<R, P> visitor, P p);
// ...
}
This allows to pass additional parameters to visitor and retrieve a result from it. So, the expression evaluation can be implemented like this:
public class EvaluatingVisitor
implements ExpressionNodeVisitor<Double, Void> {
public Double visitNumber(NumberNode number, Void p) {
// Parse the number and return it.
return Double.valueOf(number.getText());
}
public Double visitBinary(BinaryNode binary, Void p) {
switch (binary.getOperator()) {
case '+':
return binary.getLeftOperand().accept(this, p)
+ binary.getRightOperand().accept(this, p);
// More cases for other operators here.
}
}
}
The accept
method parameter isn’t used in the above example, but just believe me: it is quite useful to have one. For example, it can be a Logger instance to report errors to.
1
I have implemented the visitor pattern on a recursive tree before.
My particular recursive data structure was extremely simple – just three node types: the generic node, an internal node that has children, and a leaf node that has data. This is much simpler than I expect your AST to be, but perhaps the ideas can scale.
In my case I deliberately did not let the Accept of a node with children call Accept on its children, or to call visitor.Visit(child) from within the Accept. It is the responsibility of the correct “Visit” member implementation of the visitor to delegate Accepts to children of the node being visited. I chose this way because I wanted to allow different Visitor implementations to be able to decide the order of visitation independently of the tree representation.
A secondary benefit is that there are almost no artifacts of the Visitor pattern inside my tree nodes – each “Accept” just calls “Visit” on the visitor with the correct concrete type. This makes it easier to locate and understand the visiting logic, it’s all inside the visitor implementation.
For clarity I’ve added some C++-ish pseudocode. First the nodes:
class INode {
public:
virtual void Accept(IVisitor& i_visitor) = 0;
};
class NodeWithChildren : public INode {
public:
virtual void Accept(IVisitor& i_visitor) override {
i_visitor.Visit(*this);
}
// Plus interface for getting the children, exercise for the reader ;-)
};
class LeafNode : public INode {
public:
virtual void Accept(IVisitor& i_visitor) override {
i_visitor.Visit(*this);
}
};
And the visitor:
class IVisitor {
public:
virtual void Visit(NodeWithChildren& i_node) = 0;
virtual void Visit(LeafNode& i_node) = 0;
};
class ConcreteVisitor : public IVisitor
public:
virtual void Visit(NodeWithChildren& i_node) override {
// Do something useful, then...
for(Node * p_child : i_node) {
child->Accept(*this);
}
}
virtual void Visit(LeafNode& i_node) override {
// Just do something useful, there are no children.
}
};
3
You work the visitor pattern around a recursive structure the same way you would do anything else with your recursive structure: by visiting the nodes in your structure recursively.
public class OperationNode
{
public int SomeProperty { get; set; }
public List<OperationNode> Children { get; set; }
}
public static void VisitNode(OperationNode node)
{
... Visit this node
foreach(var node in Children)
{
VisitNode(node);
}
}
public static void VisitAllNodes()
{
VisitNode(rootNode);
}
4