Lecture 23: Analysis of procedures, objects, and pointers
Analysis of object-oriented languages
* Fast and Effective Optimization of Statically Typed Object-Oriented Languages (includes description of Rapid Type Analysis algorithm)
Example of the problem
Given the following object-oriented code :
class A { m(){...} } class B extends A { m(){...} }
and given the following execution :
A x; x = new A(); x = new B(); x.m();
then we cannot know statically which function m will be called here.
Solving this problem is difficult. As we do not want to spend too much time on checking, the solution will be quite simple (but less than previous work).
Possible Methods (from simplest to more complex) :
- Trivial Analysis (TA) : We suppose that each method as a unique name
- Ckass Hierarchy Analysis (Cha) : We build the hierarchy and search in all these classes
- Rapid Type Analysis (RTA) : We look only at instanciated classes
RTA
In order to do this, one has first to determine the classes that can be called. This means that one has to build the Call Graph and bound it to the part we want to analyse. Producing the best graph can be arbitrarly complicated !
Problems :
- Using this graph we can analyze only this subset of the whole graph.
- It do not even take the order of the commands into account
RTA using Set Constraints
Defining
- M : the set of procedure bodies
- P : the set of call sites
- T : the types
We use the following functions :
- : all the program points of where a call occurs
- : the method that is called according to declared type
- : instanciated methods (contained in the instanciated type) in m
- : the method that overrides m
We can then introduce as the set variable of the called methods :
We also define as the set of all possibly called methods that we can build using the following formula :
Type Checking using Set Constraints
We want that our compiler detects correctly the possible subset of types for each function.
Example
Using the following grammar :
type form = Atom of string | And of form*form | Or of form*form | Not of form
we can have the following types : which is an infinite set (“_” represents any kind of type).
But if we define the following function :
let rec elimOr(f : form) : form = match f with S1: | Atom s -> Atom s S2: | And(f1,f2) -> And(elimOr f1, elimOr f2) S3: | Or(f1,f2) -> Not(And(Not(elimOr f1), Not(elimOr f2))) S4: | Not(f1) -> Not(elimOr f1)
It is clear that the output types of this function is a subset of (as the is removed). In fact, if we define as the resulting type of this function, can be computed by computing the fixpoint of :
In this particular case, the fixpoint is : which computes correctly the fact that the has been removed.
This example shows that we can use set constraints to create type systems ! Read the following papers for more information :
- Type inference. Semantics of untyped lambda calculus.
Modularity: Componential set-based analysis
Deciding set constraints
Some approaches:
- simplification algorithms - most efficient for tractable classes
- techniques specific to set constraints (using hypergraphs, The complexity of set constraints)
- reduction to monadic class of first-order logic (Set constraints are the monadic class)
- regular tree grammars and tree automata (Practical aspects of set based analysis)
- reduction to term algebras (First-order theory of structural subtyping)
Term Algebras
- : set of all ground terms
- First-order theory :
This is First-Order Logic (FOL) with variables ! We can relate this to set constraints
Monadic Class of FOL
MFOL1) has : only unary predicates, no function symbol, quatifiers and logic.
Translating this to our sets language :
F := S = S | S ⊆ S | card(S) ≥ k | card(S) = k | F ∧ F | ¬ F | ∃v.F S := v | S ∪ S | S ∩ S | S' | ∅ | 1
Where :
- is syntactic sugar to represent a unary operator, as is not a variable, that compare the cardinality of with
- is an integer
- is the complement set of
- represents the whole universe
In order to agree with our definitions of set constraints, we need to eliminate :
We then have to be able to represent the following formulas :
We can then perform quantifier elimination. Note that this is not possible in case we do not have the operator : \begin{equation*}\exists v.card(S_1 \cap v) = k_1 \wedge card(S_1 \cap v') = k_2 \Rightarrow card(S_1) = k_1 + k_2\]
After elimination of the quantifiers, deciding the result of the formula is trivial.