# Concrete vs Abstract Syntax Trees

## Concrete Syntax Tree

Given a grammar, a concrete syntax tree is a directed tree with ordered children, given like this:

- leaves are labeled by terminals (tokens)
- internal nodes are labelled by non-terminals
- if a node is labelled by non-terminal , then
- there must be a production such that
- the children of the node are labelled by , …, (in that order)

- root is the start symbol

Example grammar, with start symbol 'term':

term ::= factor | term "+" factor factor ::= INTLITERAL | IDENTIFIER

An example tree:

term(term(term(factor(IDENTIFIER("xx"))), "+", INTLITERAL(42)), "+", factor(IDENTIFIER("yy"))

This tree shows that

xx + 42 + yy

can be generated by the grammar.

The **yield** of the concrete syntax tree is the sequence of terminals that we obtain by traversing the tree in infix order.

The parse tree “proves” that the yield can be generated by the grammar.

Parse tree is more abstract than derivation: there are several derivation that result in the same concrete syntax tree, depending on the order in which we expand the non-terminals.

## Abstract Syntax Tree (AST)

These trees typically ignore simple tokens such as

- keywords
- parentheses
- operator symbols

Instead, the names of syntax tree nodes and their content carry sufficient information

abstract class Expression case class IntegerLiteral(v : Int) extends Expressioon case class Variable(id : String) extends Expression case class Plus(e1 : Expression, e2 : Expression) extends Expression case class Minus(e1 : Expression, e2 : Expression) extends Expression

## The Ordering of Operators

The order of operators in AST determines the meaning. Compare:

Minus(Minus(100,20),5) --> 75 Minus(100,Minus(20,5)) --> 85

Must be careful to ensure the desired priority and the order of application of operators.

left-associative operators:

def parseTerm : Expression = { var e = parseFactor while (lex.token = MINUS) { lex.next e = Minus(e, parseFactor) } e }

right-associative operator:

def parseTerm : Expression = { val e1 = parseFactor if (lex.token = EXPONENT) { lex.next val e2 = parseTerm Exponent(e1,e2) } else { e1 } }