Lecture 2

Will soon post a clean PDF version of our notes

Manually proving an example

For software verification. Programs must be represented as logical formulas. These logical formulas can be used for proving invariants. This lecture explains the base theory about Propositional logic, First Order logic and Relations.

Recall our example from Lecture 1, with a loop invariant:

    public static int sum(int a0, int n0)
   /*:
     requires "a0 > 0 & n0 > 0"
     ensures "result > 0"
   */
    {
        int res = 0, a = a0, n = n0;
        while //: inv "a > 0 & res >= 0 & (n = n0 | res > 0)"
            (n > 0)
        {
            a = 2*a;
            res = res + a;
            n = n - 1;
        }
        return res;
    }

To prove ensured properties in the program, one first has to prove that this program will terminate or not, and then by showing that the program doesn't violate the required property prove it is complete. In case of loops in the program, an invariant of a loop in terms of a logical formula is needed such that after any number of iterations of the loop the invariant continues to hold. To prove the invariant for the above while loop, we have to show that after each iteration of the loop this invariant still holds. As we can observe the following logical formula shows that above invariant is true.

$\begin{array}{l} a > 0 \wedge res \geq 0 \wedge (n = n0 \lor res > 0) \rightarrow \\ 2a \land res + 2a >= 0 \wedge (n - 1 = n_0 \lor res + 2a \geq 0) \end{array}$

Logical preliminaries

We used these constructs in mathematics all the time. Formalizing them is useful for

foundation of mathematics
proving formulas automatically: need to represent formulas and make correct reasoning steps on them

Propositional logic

Basic propositional operations: $\land, \lor, \lnot, \rightarrow, \leftrightarrow$

A formula of the propositional calculus is a string generated by the following grammar:

$\begin{array}{l} F ::= p\ |\ F_1 \wedge F_2\ |\ F_1 \vee F_2\ |\ \lnot F\ |\ F_1 \rightarrow F_2\ |\ F_1 \leftrightarrow F_2 \end{array}$

We can simplify this grammar and only use conjunction, disjuntion and negation by observing that the implication and equivalency relations can be expressed as:

$\begin{array}{l} F_1 \rightarrow F_2 \Longleftrightarrow \lnot F_1 \vee F_2\\ F_1 \leftrightarrow F_2 \Longleftrightarrow (F_1 \rightarrow F_2) \wedge (F_2 \rightarrow F_1)\\ \end{array}$

Example formula:

$((p \rightarrow q) \land (\lnot p \rightarrow r)) \leftrightarrow ((p \land q) \lor (\lnot p \land r))$

We further define the interpretation of a formula. Let F be a propositional formula and let $\{ p_1, \dots, p_n\}$ be the set of atoms appearing in F. An interpretation for F is a function $e : \{p_1, \dots, p_n\} \rightarrow \{true, false\}$ , that is, e assigns one of the truth values true or false to each atom.

$\begin{array}{l} \textrm{Suppose}\ e(p) = true\ \textrm{and}\ e(q) = false,\ \textrm{then we have:}\\ \llbracket p \wedge q \rrbracket e =\llbracket p \rrbracket e \wedge \llbracket q \rrbracket e = true \wedge false = false\\ \end{array}$

we can also write the following:

$\begin{array}{l} \\ \llbracket F_1 \wedge F_2 \rrbracket e =\llbracket F_1 \rrbracket e \wedge \llbracket F_2 \rrbracket e\ \textrm{(for all conditions)}\\ \llbracket F_1 \rightarrow F_2 \rrbracket e = \llbracket F_1 \rrbracket e \rightarrow \llbracket F_2 \rrbracket\\ \end{array}$ where $\rightarrow$ is in fact a binary math function.

$\begin{array}{l} \llbracket \lnot F \rrbracket e = \lnot\llbracket F \rrbracket e\\ \end{array}$

We next give the definitions of satisfiability and validity: a propositional formula F is satisfiable if its value is true in some intepretation. A satisfying interpretation is called a model for F. F is valid if its value is true in all interpretations – a valid propositional formula is also called a tautology. The notation is: $\vDash F.$

Is “false implies true” a valid formula?

Checking satisfiability is done by using a SAT solver. Now the question that arises is how to check for validity using a SAT solver ?

There are two ways of tackling the problem. One would be to notice that there are $2^n$ possibilities in total and just check for everyone of them if $\lnot(\exists e \llbracket \lnot F \rrbracket e = true)$ . The other is to convert to CNF or DNF and then use the existing techniques for solving.

SAT solvers: http://www.satlive.org/

Conjunctive Normal Form. Disjunctive Normal Form

We give the grammars for conjunctive normal form (CNF) and disjunctive normal form (DNF) as:

$\begin{array}{l} L ::= p\ |\ \lnot p\\ C ::= L\ |\ C_1 \vee C_2\\ F ::= C\ |\ F_1 \wedge F_2\\ \end{array}$

is the grammar for CNF, and

$\begin{array}{l} L ::= p\ |\ \lnot p\\ C ::= L\ |\ C_1 \wedge C_2\\ F ::= C\ |\ F_1 \vee F_2\\ \end{array}$

is the grammar for DNF.

First-order logic (predicate calculus)

Predicate calculus is interesting because it supports quantifiers ( $\forall,\,\exists$ symbols). We define a predicate as a function from some arguments to a boolean value.

Example: “p divides q” is a binary predicate on p and q.

Predicate calculus is concerned with properties of predicates (regardless of what these predicates are).

Constructs in predicate calculus:

$\forall, \exists, P(t_1,t_2), t_3=f(t_4)$

The syntax for first-order logic is given by:

$\begin{array}{l} F ::= A\ |\ F_1 \wedge F_2\ |\ F_1 \vee F_2\ |\ \lnot F\ |\ \forall x.F\ |\ \exists x.F\\ A ::= P(t_1, t_2)\ |\ P(t)\ |\ t_1 = t_2\\ t ::= x\ |\ y\ |\ f(t)\ |\ h(t_1, t_2, \dots)\\ ex.: \forall x,\ \exists y\ such\ that\ x<y \end{array}$

Drinker's paradox (made more politically correct): in every restaurant there is a guest such that if this guest is having a dessert, then everyone in the restaurant is having a dessert.

Suppose predicate P(y) denotes that guest y is having a dessert. Then we can formally write the drinker's paradox as:

$\begin{array}{l} \exists e.(P(x) \rightarrow \forall y.P(y)) \end{array}$

To prove the drinker's paradox formula holds true we have to notice that it can be reduced to a simple implication. Although implication is confusing in language, in logics we must note that P \rightarrow Q is false only if $P=true \wedge Q=false$ .

Semantics for First-order logics

$\begin{array}{l} \llbracket P(x) \rrbracket e = e(x) \in \llbracket P \rrbracket e\\ \\ \llbracket R(x, y) \rrbracket e = e(x), e(y) \in \llbracket R \rrbracket e\\ \\ \llbracket F_1 \wedge F_2 \rrbracket e = \llbracket F_1 \rrbracket e \wedge \llbracket F_2 \rrbracket e\\ \\ \end{array}$

for quantifiers we have:

$\begin{array}{l} \llbracket \forall x.F \rrbracket e = \forall d.\llbracket F \rrbracket (e[x\mapsto d])\\ \end{array}$

$\begin{array}{l} \llbracket \exists x.F \rrbracket e = \exists d \in D.\llbracket F \rrbracket (e[x\mapsto d])\\ \\ \exists x.P(x) \Longleftrightarrow \lnot\forall x.\lnot P(x)\\ \lnot\exists x.P(x) \Longleftrightarrow \forall x.\lnot P(x)\\ \lnot\forall x.P(x) \Longleftrightarrow \exists x.\lnot P(x)\\ \lnot(F_1 \wedge F_2) \Longleftrightarrow \lnot F_1 \vee \lnot F_2\\ \lnot(F_1 \vee F_2) \Longleftrightarrow \lnot F_1 \wedge \lnot F_2\\ \end{array}$

Tarskian semantics for first-order logic.

Is there an algorithm for first-order logic?

Checking (and enumerating) proofs.

What is a formal proof?

Sets and relations

A set can be thought of as any collection of distinct objects considered as a whole. For example,

$\begin{equation*} \begin{array}{l} A = \{2,4,6\}\\ B = \{4,5\} \end{array} \end{equation*}$

We can define many operations over sets such as.

$\begin{equation*} \begin{array}{l} A \cup B = \{ 2,4,5,6 \}\\ A \cap B = \{4\}\\ A / B = \{2,6\} \end{array} \end{equation*}$

The Cartesian product is a direct product of sets. Specifically, the Cartesian product of two sets X and Y, denoted X × Y, is the set of all possible ordered pairs whose first component is a member of X and whose second component is a member of Y:

$\begin{equation*} \begin{array}{l} A \times B = \{(a,b)| a \in A \wedge b \in B\} \end{array} \end{equation*}$

We will use binary relations on program states to represent the meaning of programs. A binary relation on set S is a subset of $S \times S$ .

The diagonal relation is $\Delta_S = \{(x,x) \mid x \in S \}$ .

Because relation is a set, all set operations apply. In addition, we can introduce relation composition, denoted by circle

$\begin{equation*} r \circ s = \{(x,z) \mid \exists y.\ (x,y) \in r \land (y,z) \in s \} \end{equation*}$

Then

$\begin{equation*} r^n = r \circ \ldots \circ r \ \ \ \ \mbox{($n$ times)} \end{equation*}$

$\begin{equation*} r^* = \Delta_S \cup r \cup r^2 \cup r^3 \cup \ldots \end{equation*}$

Sets and predicates are interchangable.

$x \in P$ turns set into predicate.

$\{ x \mid P(x)\}$ turns predicate into a set.

$\begin{equation*} \{ x \mid P_1(x) \} \cap \{ x \mid P_2(x) \} = \{ x \mid P_1(x) \land P_2(x) \} \end{equation*}$

$\begin{equation*} (x \in S_1 \cap S_2) \leftrightarrow (x \in S_1 \land x \in S_2) \end{equation*}$

Here we present few properties of transition relation

Composition of a relation is associative: $r \circ (s \circ t) = (r \circ s ) \circ t$

Composition of a relation is not commutative $r \circ s \neq s \circ r$

Composition of a relation is distributive over union $r \circ(s \cup t) = (r \circ s)\ \cup\ (r \circ t)$

The property that $(r^* \circ s^*)^* = (r \cup s)^*$ is also true about composition of variables.

Programs as relations

Now back to meaning of programs.

Simple programming language

x = exp
if (F) c1 else c2
c1 ; c2
while (F) c1

Consider a program that manipulates two integer variables.

What is the state - map from names to values.

What is the meaning of

x=y+3
x=x+y
x=y+3 ; x=x+y

Observe how the following update commands are translated to binary relations:

$\begin{array}{l} \llbracket X = Y + 3 \rrbracket\ \ r_1 = \{((x_1,y_1)(x_2,y_2)),y_2=y_1 \wedge x_2 = y_1 + 3 \}\\ \llbracket X = X + Y \rrbracket\ \ r_2 = \{((x_1,y_1)(x_2,y_2)),y_2=y_1 \wedge x_2 = x_1 + y_1 \} \end{array}$

If the above update commands appear one after another in a program then by relation composition they can be represented as the one relation:

$\begin{array}{l} \llbracket X = Y + 3, X = X +Y \rrbracket=\\ \end{array}$

$\begin{array}{l} \exists(x_2,y_2) \{((x_1,y_1)(x_2,y_2),(x_3,y_3)),y_2=y_1 \wedge x_2 = y_1 + 3\\ \wedge y_3 = y_2 \wedge x_3 = x_2 + y_2 \} \end{array}$

Programs as formulas

Can we represent relations symbolically?

In general? In our examples?

$r = \{s \mid F(s(x),s(y)) \}$

Accumulation of equalities.

Guarded command language

Our simple language:

x = exp
if (F) c1 else c2
c1 ; c2
while (F) c1

Guarded command language:

assume(F)
c1 [] c2
c1 ; c2
assert(F)
havoc(x)
c1*

Relational semantics of guarded commands.

The following is the interpretation of special commands uses in Guarded Command Language.

$\begin{array}{l} \llbracket assume(F) \rrbracket = \{(s,s)|F(s)\} & \textrm{ success} \\ \end{array}$

$\begin{array}{ll} \llbracket assert(F) \rrbracket = \{(s,s_e)| s \in F(s) \} & \textrm { Jump into Error State }\\ \end{array}$

$\begin{array}{ll} \llbracket C1 [] C2 \rrbracket = [[C1]] U [[C2]] & \textrm{ Non deterministic} \\ \end{array}$

$\begin{array}{ll} \llbracket havoc(x) \rrbracket = \{(s,s1)| \forall "y"\\ "y" \neq "x" , s1("y") = s("y")\} & \textrm { Modify a variable randomly.} \end{array}$

Example of Guarded Command Language

$\begin{array}{lcl} if(F) C_1; \, else \, C_2; & \equiv & (assume(F); C_1)[](assume(\neg F); C_2)\\ while(F)\quad C_1; & \equiv & (assume(F);C_1)^*;assume(\neg F) \end{array}$

Transforming language into guarded commands:

conditionals:

if (F) c1 else c2

⇒

(assume(F); c1)  []
(assume(~F); c2)

loops:

while (F) c1

⇒

(assume(F); c1)*; 
assume(~F)

loops, if we have loop invariants

while [inv I] (F) c1

⇒

assert(I);
havoc x1,...,xn;
assume(I);
(assume(~F) []
 assume(F);
 c1;
 assert I;
 assume(false));

Transformations such as these are used in Jahob, ESC/Java, Spec#.

Weakest Preconditions

Weakest precondition of a set of states is the

$wp(r,P) = \{ s_1 \mid \forall s_2. (s_1,s_2) \in r \rightarrow s_2 \in P \}$

What happens if r is deterministic (a function on states)?

Set of states - formula with state variables for one state.

Relation on states - formula with state variables for two states.

Important property: conjunctivity.

$wp(r, P_1 \cap P_2) =$

Weakest preconditions for guarded commands.

Weakest preconditions of assignments make wp very appealing.