Concrete vs Abstract Syntax Trees

Concrete Syntax Tree

Given a grammar, a concrete syntax tree is a directed tree with ordered children, given like this:

• leaves are labeled by terminals (tokens)
• internal nodes are labelled by non-terminals
• if a node is labelled by non-terminal , then
• there must be a production such that
• the children of the node are labelled by , …, (in that order)
• root is the start symbol

Example grammar, with start symbol 'term':

```term ::= factor | term "+" factor
factor ::= INTLITERAL | IDENTIFIER```

An example tree:

```term(term(term(factor(IDENTIFIER("xx"))),
"+",
INTLITERAL(42)),
"+",
factor(IDENTIFIER("yy"))```

This tree shows that

`xx + 42 + yy`

can be generated by the grammar.

The yield of the concrete syntax tree is the sequence of terminals that we obtain by traversing the tree in infix order.

The parse tree “proves” that the yield can be generated by the grammar.

Parse tree is more abstract than derivation: there are several derivation that result in the same concrete syntax tree, depending on the order in which we expand the non-terminals.

Abstract Syntax Tree (AST)

These trees typically ignore simple tokens such as

• keywords
• parentheses
• operator symbols

Instead, the names of syntax tree nodes and their content carry sufficient information

```abstract class Expression
case class IntegerLiteral(v : Int) extends Expressioon
case class Variable(id : String) extends Expression
case class Plus(e1 : Expression, e2 : Expression) extends Expression
case class Minus(e1 : Expression, e2 : Expression) extends Expression```

The Ordering of Operators

The order of operators in AST determines the meaning. Compare:

```Minus(Minus(100,20),5)   --> 75
Minus(100,Minus(20,5))   --> 85```

Must be careful to ensure the desired priority and the order of application of operators.

left-associative operators:

```def parseTerm : Expression = {
var e = parseFactor
while (lex.token = MINUS) {
lex.next
e = Minus(e, parseFactor)
}
e
}```

right-associative operator:

```def parseTerm : Expression = {
val e1 = parseFactor
if (lex.token = EXPONENT) {
lex.next
val e2 = parseTerm
Exponent(e1,e2)
} else {
e1
}
}```