LARA

Constraint-Based Type Checking and Inference

Motivation for Type Inference

Example:

 % y:int. let ((f:int->int) = % x:int. x+3) in f (f (y+5))

Bottom-up type checking concludes:

y:int
5:int
y+5:int
x:int
3:int
x+3:int
(% x:int. x+3): int->int
f:int->int
f(y+5):int
f(f (y+5)):int
(% y:int. let ((f:int->int) = ...) : int->int

What if we have expression $expr$ without explicit type declarations:

 % y. let (f= %x.x+3) in f (f (y+5))

Consider type constraints for %x.x+3. Assume

x:a
x+3:b
%x.x+3:c

where a,b,c are unknown types. Given that we know

3:int
+:int->int->int

for x+3 to type check, i must be a=int and b=int. By abstraction rule then c = a→b = int→int. Therefore, f:int→int.

Similarly y+5 can be type checked only if y:int, and in that case y+5:int. Using type of f, we conclude f(y+5):int and f(f(y+5)):int. The entire expression $expr$ therefore has type int→int.

Bottom-Up versus Constraint-Based

We have seen bottom-up type checking

  • obtain types for leaves
  • propagate them to the root of syntax tree
  • requires all lambda expressions to be typed
  • what if some variables are undeclared?
  • must propagate information in multiple directions

A more general, constraint-based approach:

  • introduce a type variable corresponding to each tree node
  • use type rules to describe constraints between variables
  • solve constraints

Possible outcomes:

  • no solutions: type error
  • unique solution: program is type correct
  • multiple solutions, either
    • report umbiguity
    • compute the most general type, allow expression use in many contexts

Lambda Calculus with Optional Types

  E ::= v | E E | %d.E | (E:T)
  d ::= v:T | v
  T ::= int | boolean | T -> T

Note:

  • declarations of types of variables are optional
  • we can optionally annotate other terms with their concrete types

Type Constraints for Lambda Calculus

Rules for type constraints are like type checking rules, but:

  • contain not only concrete types, but also types with type variables that denote unknown types
  • contain constraints between types

Type inference process

  • maintain in the environment a type variable for each bound variable
  • introduce a fresh type variable $t_e$ for each sub-expression $e$
  • compute constraints between type variables
  • solve type constraints

Constraints on type variables correspond to type checking rules

Variable Rule

If $(x,t_0) \in \Gamma$ and $t_1$ is the type introduced for an occurrence of $x$, then introduce constraints $t_0 = t_1$

Abstraction Rule

When generating constraints for $\lambda x.e$, introduce a variable for $t$ for $x$ and generate constraints for the body $e$ in environment $\Gamma \oplus \{ x \mapsto t \} \vdash e$. To the resulting expression assign type $t \to t_e$ where $t_e$ is the type variable for $e$.

Application Rule

  • $e_1:t_1$
  • $e_2:t_2$
  • $(e_1 e_2): t_3$

then $t_1 = t_2 \to t_3$

Type Annotation Rule

When expression $e$ is explicitly typed $t$, then add $v_e = t$ to constraints, where $v_e$ is variable for expression $e$

Solving Type Constraints

Type constraints have the form $t=t'$ where $t,t'$ are type expressions containing:

  • ground (primitive) types int, boolean
  • type variables
  • function arrow constructor

Such constraints can be solved using Unification

  • two types are equal iff variables can be instantiated to obtain same abstract syntax tree for types