Lecture 22: Set constraints
Introduction
We can view analysis and verification as two stage process
program + properties => system of equations/formulas => solution
In general, the system of equations (for a program expressed as one loop) is an approximation of the condition
where is the set of reachable states and the code of the program.
We have seen earlier
- solution for programs that have no loops (verification condition generation): is given as formula
- iterative solutions based on finite height lattice abstractions: computed iteratively
- constraint-based analysis based on linear inequality templates: postulate coefficients for of the form , solve for them
Today we look at another constraint based analysis, which deals with non-numeric properties and is based on set constraints.
Two problems:
- how to generate set constraints for a given analysis
- how to solve set constraints (techniques often reusable for different analysis)
Set constraints can be naturally used to express
- interprocedural abstract interpretation and dataflow analysis
- type inference
Different axes for classifying analyses:
- context sensitivity: does not produce false paths in procedure calls/returns
- flow sensitivity: taking into account the ordering of statements within a procedure, taking not into account the ordering may seem strange but is useful to fasten the analysis of large programs
- field sensitivity: when analyzing data structures, y=x.f and y=x.g can be treated like members of the same object x
- non-relational (independent attribute, Cartesian) VS. relational (dependent attribute)
- example : non-relational VS. relational
Set constraints are suitable for non-relational analyses, whether they are flow sensitive or not and whether they are context-sensitive or not.
Type systems: context-sensitive (e.g. generics), work for higher-order programs (and callbacks and function pointers). Often flow insensitive (unless they are “typestate” or “flow sensitive type qualifiers”). Sometimes designed for purely functional programs (higher order imperative programs are difficult to reason about).
Notion of set constraints
Syntax
We can define the subset constraints between expressions denoting sets of terms using the following grammar :
where is a constant, is a terminal (it can be a variable or function symbol), is an application function and is the projection of th parameter of .
Note: set difference () as it is not monotonic, so its use is restricted if we want to ensure existence of solutions.
Semantic
We define the semantic of our syntax recursively by defining each of its elements over a language . We first have that :
where is the Herbrand univers, or “set of all ground terms”, or also called the “term model”. As we already saw, it's the set of all possibly constructed trees using language :
Given that, if the meaning of is given (i.e. we have ), then we can estimate the meaning of our language as following :
Example
Using this example we can already see how we could use set constraints to analyze procedure calls as can be viewed as a procedure call on which we apply possible parameters.
An important property we would like to have in this semantic is a very intuitive one :
This property is in fact not conserved as we can easily see using a very simple counter-example :
The correct interpretation of this property is :
Live variable analysis using set constraints
A variable is said to ba alive when it's value will be reused later in the program.
Example :
x = 1; y = 5; z = 7; z = x + y;
in this example, only variables x and y are alive.
Example of use:
- register allocation
- related problems arise in memory preallocation
In order to compute the set of live variable, let's take the same example as before to use it to deduce the update rule. We can see our program as the following tree :
From this we can deduce the following set equations that define the relation between the sets of active variables at different program points :
Where as they are used in and as it's the newly defined variable in . Here we do not have a problem with the monotonicity of the \ operator as is not a variable.
Control-flow analysis | Closure Analysis
Example of the problem
In a functional language, f is a variable that pass a function. In this context, to which function (f x y) refers to ? In order to answer this question, we have to keep track of who called f.
- This analysis is handled by the calculus. As a reminder, it has the following grammar : where represent the application of the function (the left one) to its argument
- We can rename functions and variables such that they are unique
- We can also add to say that it exists
- We can then convert these expressions to set constraints :
where is a part of the program and is a constant.
For the case of application, the question is : Can describe the expression ?
Example We will try to see if this is the case using .
From this we can deduce, as we defined previously, the following contraints :
Which leads to the following solutions :
Which shows that the variable can described our test expression.
Conditional Constraints
Now that we know how to convert the \lambda calculus into set constraints, we need to be able to express into set constraints too as it is not part of our gramar.
The general form of the conditional constraints is the following : .
We can transform it like this :
Which has the correct meaning as we have :
Shape analysis of algebraic data types
Not seen in class, have a look at the papers !
Using unary functions for interprocedural analysis
Not seen in class, have a look at the papers !
Solving set constraints
Not seen in class, have a look at the papers !