Notes for cs164 Section 1
- abstract syntax trees (ASTs)
- Abstract away irrelevant details of concrete syntax, e.g. comments,
whitespace, parens. For example,
if x == y then
x = 0; else y = 0;
and
if (x
== y) then { x = 0; } else { y = 0; }
are essentially the same program, even though they differ slightly
textually. The ASTs for these two programs will be identical.
- PA1 handout has an example of some code and its corresponding
AST.
- Implementation: Composite pattern (see Design Patterns for
details). Nodes in the tree have a common superclass, ASTNode, allowing
for inheritance of basic tree code (e.g. traversing children). Each
node defines accessor methods for its children. Draw object
diagram w/ ASTNode and some subclass, e.g. InfixExpression.
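A minimal sketch of the Composite structure described above, in Java. Only ASTNode, InfixExpression, NumberLiteral, getLeftOp() and getRightOp() are names from these notes; the child list, addChild(), child(), size(), getValue() and getOperator() are assumptions made for illustration, not the PA1 classes.

```java
import java.util.ArrayList;
import java.util.List;

// Common superclass: owns the child list and the generic tree code.
class ASTNode {
    private final List<ASTNode> children = new ArrayList<>();

    protected void addChild(ASTNode child) { children.add(child); }
    protected ASTNode child(int i) { return children.get(i); }

    // Generic traversal inherited by every node type:
    // counts the nodes in the subtree rooted here.
    int size() {
        int n = 1;
        for (ASTNode c : children) {
            n += c.size();
        }
        return n;
    }
}

// Leaf node: no children, just a value.
class NumberLiteral extends ASTNode {
    private final int value;
    NumberLiteral(int value) { this.value = value; }
    int getValue() { return value; }
}

// Interior node: two children with named accessors.
class InfixExpression extends ASTNode {
    private final String operator;

    InfixExpression(ASTNode left, String operator, ASTNode right) {
        this.operator = operator;
        addChild(left);
        addChild(right);
    }

    ASTNode getLeftOp() { return child(0); }
    ASTNode getRightOp() { return child(1); }
    String getOperator() { return operator; }
}
```

Note that the accessors return the declared type ASTNode, which is what makes the dispatch problem in the next section show up.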
- The Visitor pattern
- If we can edit AST source code, we can add operations by adding a
method to ASTNode and all subclasses, e.g. interpret() or
prettyPrint().
- What if we can't edit the AST code? We must define new operations
externally. Consider interpret():
- Put all interpret() methods together. Since they aren't inside AST
node classes anymore, need to pass AST node as parameter.
interpret(InfixExpression e) {
  interpret(e.getLeftOp());
  interpret(e.getRightOp());
  ...
}
interpret(WhileStatement w) { ... }
- Problem: Java picks among overloaded methods at compile time, based on
the declared type of the arguments, not their run-time type. Example
(InfixExpression extends ASTNode):
ASTNode n = new ASTNode();
InfixExpression e = new InfixExpression();
ASTNode n2 = new InfixExpression();
interpret(n); // calls interpret(ASTNode)
interpret(e); // calls interpret(InfixExpression)
interpret(n2); // calls interpret(ASTNode)!
So, calling interpret(e.getLeftOp()) above may not invoke the correct
interpret() method. We need to add explicit dispatch code:
if (e.getLeftOp() instanceof InfixExpression) {
  interpret((InfixExpression) e.getLeftOp());
} else if (e.getLeftOp() instanceof NumberLiteral) {
  interpret((NumberLiteral) e.getLeftOp());
}
...
This is ugly!
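The overloading behavior above can be checked with a small self-contained program. This is a sketch only: OverloadDemo and dispatch() are hypothetical names, the node classes are empty stand-ins, and the methods return strings so the chosen overload is visible.

```java
class ASTNode {}
class InfixExpression extends ASTNode {}

class OverloadDemo {
    static String interpret(ASTNode n) { return "interpret(ASTNode)"; }
    static String interpret(InfixExpression e) { return "interpret(InfixExpression)"; }

    // Explicit dispatch on the run-time type, as in the instanceof chain above.
    static String dispatch(ASTNode n) {
        if (n instanceof InfixExpression) {
            // The cast changes the declared type, so the other overload is chosen.
            return interpret((InfixExpression) n);
        }
        return interpret(n);
    }

    public static void main(String[] args) {
        ASTNode n2 = new InfixExpression();
        System.out.println(interpret(n2)); // overload chosen from declared type ASTNode
        System.out.println(dispatch(n2));  // overload chosen after the explicit cast
    }
}
```

Every new node type needs another branch in dispatch(), which is the maintenance cost the notes call ugly.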
- A better solution: AST designed with a "hook" allowing for new
operations to easily be added externally. Each AST node class has an
accept() method that takes a visitor object as a parameter. Each
visitor class contains some operation to be performed on the AST,
defined in visit() methods for each type of AST node. The accept()
methods are simple:
accept(Visitor v) { v.visit(this); }
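Since the declared type of this is always the type of the enclosing
class, the correct visit() method will be invoked. This works only if
every node class defines its own accept(): the call to accept() is
dispatched on the run-time type of the node, and inside it the
overload of visit() is chosen from the declared type of this. Written
as a visitor, the original interpret() method for InfixExpression
looks like this:
visit(InfixExpression e) {
  e.getLeftOp().accept(this);
  e.getRightOp().accept(this);
  ...
}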
- Other advantages of visitor: all the code for an operation is in
one place, avoids cluttering interface of AST nodes, easy to share
state between methods (through fields).
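The accept()/visit() hook can be sketched end to end as follows. The Visitor interface shape, the char operator, getValue(), getOperator() and the CountingVisitor operation are assumptions chosen for illustration; the point is that a new operation is added without touching the node classes, and that it keeps its state in fields.

```java
interface Visitor {
    void visit(NumberLiteral n);
    void visit(InfixExpression e);
}

abstract class ASTNode {
    abstract void accept(Visitor v);
}

class NumberLiteral extends ASTNode {
    private final int value;
    NumberLiteral(int value) { this.value = value; }
    int getValue() { return value; }
    // Here `this` has declared type NumberLiteral, so visit(NumberLiteral) is chosen.
    @Override void accept(Visitor v) { v.visit(this); }
}

class InfixExpression extends ASTNode {
    private final ASTNode left, right;
    private final char operator;
    InfixExpression(ASTNode left, char operator, ASTNode right) {
        this.left = left;
        this.operator = operator;
        this.right = right;
    }
    ASTNode getLeftOp() { return left; }
    ASTNode getRightOp() { return right; }
    char getOperator() { return operator; }
    // Same body as above, but `this` is an InfixExpression here.
    @Override void accept(Visitor v) { v.visit(this); }
}

// An operation defined entirely outside the AST classes.
class CountingVisitor implements Visitor {
    int literals = 0;   // state shared between the visit() methods
    int operators = 0;

    public void visit(NumberLiteral n) { literals++; }

    public void visit(InfixExpression e) {
        operators++;
        // No instanceof chain: accept() picks the right visit() for each child.
        e.getLeftOp().accept(this);
        e.getRightOp().accept(this);
    }
}
```

Usage: build a tree, call tree.accept(new CountingVisitor()), then read the counts from the visitor's fields.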
- A pretty printer
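- Note that associativity is represented directly in the AST by the shape
of the tree (show example): with a left-associative operator, a - b - c
has a - b as the left child of the root. When reproducing source text
from an AST, may have to add parentheses to maintain associativity (in
PA1 pretty printer, always add parens).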
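A sketch of the "always add parens" strategy as a visitor. This is not the PA1 pretty printer: the class shapes, the char operator and the static print() helper are assumptions. Two trees that differ only in shape print differently, so the grouping survives the round trip.

```java
interface Visitor {
    void visit(NumberLiteral n);
    void visit(InfixExpression e);
}

abstract class ASTNode {
    abstract void accept(Visitor v);
}

class NumberLiteral extends ASTNode {
    private final int value;
    NumberLiteral(int value) { this.value = value; }
    int getValue() { return value; }
    @Override void accept(Visitor v) { v.visit(this); }
}

class InfixExpression extends ASTNode {
    private final ASTNode left, right;
    private final char operator;
    InfixExpression(ASTNode left, char operator, ASTNode right) {
        this.left = left;
        this.operator = operator;
        this.right = right;
    }
    ASTNode getLeftOp() { return left; }
    ASTNode getRightOp() { return right; }
    char getOperator() { return operator; }
    @Override void accept(Visitor v) { v.visit(this); }
}

class PrettyPrinter implements Visitor {
    private final StringBuilder out = new StringBuilder();

    public void visit(NumberLiteral n) { out.append(n.getValue()); }

    public void visit(InfixExpression e) {
        // Parenthesize every infix expression, needed or not.
        out.append('(');
        e.getLeftOp().accept(this);
        out.append(' ').append(e.getOperator()).append(' ');
        e.getRightOp().accept(this);
        out.append(')');
    }

    // Convenience entry point: print a whole tree to a string.
    static String print(ASTNode root) {
        PrettyPrinter p = new PrettyPrinter();
        root.accept(p);
        return p.out.toString();
    }
}
```

With 1 - 2 as the left child of the root the output is ((1 - 2) - 3); with 2 - 3 as the right child it is (1 - (2 - 3)). Printing without any parens would give 1 - 2 - 3 for both and lose the distinction.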
- The interpreter
- Arithmetic expressions
- Assignment: need to keep track of values of assigned variables in a
symbol table.
- If and while: don't necessarily visit all children
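The three points above can be sketched in one small interpreter visitor. Everything beyond the names InfixExpression, NumberLiteral and WhileStatement is an assumption: the Variable, Assignment and IfStatement classes, the public final fields (used instead of accessors to keep the sketch short), the result field, and the convention that a nonzero value counts as true.

```java
import java.util.HashMap;
import java.util.Map;

interface Visitor {
    void visit(NumberLiteral n);
    void visit(Variable v);
    void visit(InfixExpression e);
    void visit(Assignment a);
    void visit(IfStatement s);
    void visit(WhileStatement w);
}

abstract class ASTNode { abstract void accept(Visitor v); }

class NumberLiteral extends ASTNode {
    final int value;
    NumberLiteral(int value) { this.value = value; }
    void accept(Visitor v) { v.visit(this); }
}
class Variable extends ASTNode {
    final String name;
    Variable(String name) { this.name = name; }
    void accept(Visitor v) { v.visit(this); }
}
class InfixExpression extends ASTNode {
    final ASTNode left, right;
    final String op;
    InfixExpression(ASTNode left, String op, ASTNode right) {
        this.left = left; this.op = op; this.right = right;
    }
    void accept(Visitor v) { v.visit(this); }
}
class Assignment extends ASTNode {
    final String name;
    final ASTNode value;
    Assignment(String name, ASTNode value) { this.name = name; this.value = value; }
    void accept(Visitor v) { v.visit(this); }
}
class IfStatement extends ASTNode {
    final ASTNode cond, thenStmt, elseStmt;
    IfStatement(ASTNode cond, ASTNode thenStmt, ASTNode elseStmt) {
        this.cond = cond; this.thenStmt = thenStmt; this.elseStmt = elseStmt;
    }
    void accept(Visitor v) { v.visit(this); }
}
class WhileStatement extends ASTNode {
    final ASTNode cond, body;
    WhileStatement(ASTNode cond, ASTNode body) { this.cond = cond; this.body = body; }
    void accept(Visitor v) { v.visit(this); }
}

class Interpreter implements Visitor {
    final Map<String, Integer> symbols = new HashMap<>(); // symbol table
    private int result; // value of the most recently evaluated expression

    private int eval(ASTNode n) { n.accept(this); return result; }

    public void visit(NumberLiteral n) { result = n.value; }
    public void visit(Variable v) {
        Integer value = symbols.get(v.name);
        if (value == null) throw new IllegalStateException("undefined variable " + v.name);
        result = value;
    }
    public void visit(InfixExpression e) {
        int l = eval(e.left), r = eval(e.right);
        switch (e.op) {
            case "+": result = l + r; break;
            case "-": result = l - r; break;
            case "==": result = (l == r) ? 1 : 0; break;
            default: throw new IllegalArgumentException("unknown operator " + e.op);
        }
    }
    public void visit(Assignment a) { symbols.put(a.name, eval(a.value)); }
    public void visit(IfStatement s) {
        // Only one of the two branches is visited.
        if (eval(s.cond) != 0) s.thenStmt.accept(this); else s.elseStmt.accept(this);
    }
    public void visit(WhileStatement w) {
        // The body is visited zero or more times; the condition once more than that.
        while (eval(w.cond) != 0) w.body.accept(this);
    }
}
```

Unlike the counting or printing traversals, the if and while cases decide at run time which children to visit and how often, so the visitor cannot simply recurse into every child.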