Differences

This shows you the differences between two versions of the page.

--- sav08:relational_semantics [2009/05/26 12:54]
vkuncak
+++ sav08:relational_semantics [2015/04/21 17:30]
@@ Line 1: / Line 1: @@
-====== Relational Semantics of a Simple Language ======
-We will next give meaning (semantics) for our [[Simple Programming Language]] using relations between initial and final state of each command.  We illustrate the use of this semantics by showing certain very simple program equivalence properties.
-When we write "x" we mean the variable named x viewed as a syntactic object (string or a node in syntax tree).  We sometimes omit quotes.
-===== Examples of Semantics for Some Programs =====
-In such a semantics, if the state of the program is given by one integer variable //x//, then the meaning of this 'increment program':
-  x = x + 3;
-  x = x - 2
-is the relation $r_1 = \{(0,1),(1,2),(2,3),...,(-1,0),(-2,-1),(-3,-2),...\}$.
-A pair of integers $(v,v')$ is in relation, written $(v,v') \in r_1$, if and only if $v' = v + 1$.  Using set comprehensions, we write
-\[
-    r_1 = \{(v,v') \mid v'=v+1 \}
-\]
-The meaning $r_2$ of
-  x = x + x
-is
-\[
-   r_2 = \{ (v,v') \mid v' = 2v \}
-\]
-The meaning of
-  while (true) { }
-is empty relation.
-=== Why Relations ===
-The meaning is, in general, an arbitrary //relation//.  Therefore:
-  * For certain states there will be no results.  In particular, if a computation starting at a state $s$ does not terminate, then for no $s'$ will there be a state of the form $(s,s')$ in the relation.
-  * For certain states there will be multiple results, when $(s,s_1) \in r$ and $(s,s_2) \in r$ for $s_1 \neq s_2$.  Intuitively, this means command execution starting in $s$ will sometimes compute $s_1$ and sometimes $s_2$.  Verification of such program must account for both possibilities.
-  * Multiple results are important for modelling e.g. concurrency, as well as approximating behavior that we do not know (e.g. what the operating system or environment will do, or what the result of complex computation is)
-===== Program States, Relations on States =====
-Our programs can have any finite number of integer variables.  Let $V$ denote the set of variable names such as 'x' in the example above.  The state of our program is a function $V \to R$ mapping variable names to their values, and the set of states
-\[
-    S = (V \to R)
-\]
-is the set of all such functions.  We assume that the range of variables is $R = \mathbb{Z}$ (the set of integers), but most of what we say will be independent on whether the variables are integers or have some other type.
-In the above increment program, $V = \{"x"\}$ and a state is of the form $\{("x",v)\}$ for some integer $v$ (recall that we represent each function $x$ as relation $\{(x,f(x)\mid x \in D \}$).  So the increment function is given by a relation, call it $r$, that takes the state $\{("x",v)\}$ and maps it to state $\{("x",v+1)\}$:
-\[
-    r = \{(\{("x",v)\}, \{("x",v')\}) \mid v' = v + 1 \}
-\]
-What happens when we have more than one variable?  Let $V = \{"x", "y"\}$ and consider this program:
-  x = y + y
-The meaning of this program is given by relation
-\[
-   r = \{(\{("x",v),("y",u)\},\{("x",v'),("y",u')\}) \mid v'=2u \land u' = u \}
-\]
-Renaming variables so the initial value of variable "x" is denoted $x$ and final value denoted $x'$, we have
-\[
-   r = \{(\{("x",x),("y",y)\},\{("x",x'),("y",y')\}) \mid x'=2y \land y' = y \}
-\]
-I will call the relation //r// the //relational semantics// of this command and the formula $x'=2y \land y'=y$ the //formula semantics// of this command.
-When we have multiple variables, we can give semantics to assignment statements using [[sets and relations#function_update|function update notation]].  Regardless of the number of variables in $V$, the relational semantics of //x=y+y;// is
-\[
-    r = \{(s,s["x" \mapsto 2 s("y")]) \mid s \in S \}
-\]
-===== Semantic Functions =====
-Let $C$ denote the set of all commands (programs).  Our goal is to define semantic function for commands, denoted $r_c$ (**r**elation of the **c**ommand), which maps each command $c_1$ into its relational semantics $r_c(c_1) \subseteq S \times S$.  Formally,
-\[
-   r_c : C \to 2^{S \times S}
-\]
-To define meaning of statements such as //x=y+y;// we need a way to evaluate a term such as //y+y// in a given state.  We introduce function $f_T$ (**f**unction of the **t**erm), which maps a term into a function from states to values:
-\[
-   f_T : T \to (S \to R)
-\]
-===== Meaning of Assignment =====
-Given some definition of $f_T$, the meaning of the assignment statement is simply
-\[
-   r_c(x=e) = \{ (s,s["x" \mapsto f_T(e)(s)]) \mid s \in S \}
-\]
-===== Meaning of Terms =====
-We next define meaning of terms, whose syntax we defined [[simple programming language|earlier]].
-  T ::= K | V | (T + T) | (T - T) | (K * T) | (T / K) | (T % K)
-  F ::= (T=T) | (T < T) | (T > T) | (~F) | (F & F) | (F|F)
-  V ::= x | y | z | ...
-  K ::= 0 | 1 | 2 | ...
-The recursive definition corresponds to a program you would write in e.g. Scala to compute the value of an expression in a given state when writing an interpreter.
-Based on that context-free grammar, we can define recursively the terms semantic :
-\[
-\begin{array}{lcl}
-   f_T(k)(s) & = & k \\
-   f_T(v)(s) & = & s(v) \\
-   f_T(t_1 + t_2)(s) & = & f_T(t_1)(s) + f_T(t_2)(s) \\
-   f_T(t_1 - t_2)(s) & = & f_T(t_1)(s) - f_T(t_2)(s) \\
-   f_T(k * t)(s) & = & k * f_T(t)(s) \\
-   f_T(t / k)(s) & = & f_T(t)(s) / k \\
-   f_T(t \% k)(s) & = & f_T(t)(s) \% k \\
-   f_T(t_1 = t_2)(s) & = & f_T(t_1) = f_T(t_2) \\
-   f_T(t_1 < t_2)(s) & = & f_T(t_1) < f_T(t_2) \\
-   f_T(t_1 > t_2)(s) & = & f_T(t_1) > f_T(t_2) \\
-   f_T(\neg f)(s) & = & \neg f_T(f)(s) \\
-   f_T(f_1 \& f_2)(s) & = & f_T(f_1)(s) \land f_T(f_2) \\
-   f_T(f_1 | f_2)(s) & = & f_T(f_1)(s) \lor f_T(f_2)(s)
-\end{array}
-\]
-===== Sequential Composition as Relation Composition =====
-To execute a statement sequence, we execute the first statement and from the resulting state execute the second statement.  We therefore define statement sequential composition as relation composition of statement meanings:
-\[
-   r_c(c_1\ ;\ c_2) = r_c(c_1) \circ r_c(c_2)
-\]
-==== Example: commuting assignments ====
-Consider the task of a programmer or a compiler who wishes to reorder two assignment statements, converting
-  x = e1;
-  y = e2
-into
-  y = e2;
-  x = e1
-where $x,y \in V$ and $x \neq y$.  Under what conditions will these two program fragments have the same meaning?
-To answer that question, we just have to derive these two expressions :
-\[
-\begin{array}{lll}
-&   & r_c ( x = e_1 ; y = e_2 ) \\
-& = & r_c ( x = e_1 ) \circ r_c ( x = e_2 ) \\
-& = & \{ ( s , s[ "x" \mapsto f_T(e_1)(s) ]) | s \in S \} \circ \{ ( s , s[ "y" \mapsto f_T(e_2)(s) ])~|~s \in S \} \\
-& = & \{ ( s , ( s["x" \mapsto f_T(e_1)(s)] )["y" \mapsto f_T(e_2)( s["x" \mapsto f_T(e_1)(s) ] ) ] ) )~|~s \in S \}
-\end{array}
-\]
-Symetrically, we have:
-\[
-\begin{array}{lll}
-&   & r_c ( y = e_2 ; x = e_1 ) \\
-& = & \{ ( s , ( s["y" \mapsto f_T(e_2)(s)] )["x" \mapsto f_T(e_1)( s["y" \mapsto f_T(e_2)(s) ] ) ] ) )~|~s \in S \}
-\end{array}
-\]
-Thus, if y is free in $e_1$ and x is free in $e_2$, these two fragments will have the same meaning.
-===== Assume Statement =====
-Assume statement is a useful declarative statement, which we will use to define the meaning of several remaining commands.  The meaning of //assume// is simply is the diagonal relation, which produces no resulting states if the given expression is false, and does nothing if the expression is true:
-\[
-    r_c(\mbox{assume}(e)) = \{ (s,s) \mid f_T(e)(s) \} = \Delta_P
-\]
-where $P = \{ s \mid f_T(e)(s) \}$.
-===== Non-deterministic Choice as Union =====
-Non-deterministic choice is another useful declarative statement, which non-deterministically chooses to execute one of the two statements.  We use it to express other statements, as well as to model branching with conditions that we cannot express in the language.  Its meaning is simply union of relations:
-\[
-   r_c(c_1\ \mbox{[]}\ c_2) = r_c(c_1) \cup r_c(c_2)
-\]
-===== Representing if-then-else Using Non-Deterministic Choice and Assume =====
-We will define the semantics of
-  if (F) then c1 else c2
-as the semantics of
-  (assume(F); c1) [] (assume(~F); c2)
-Note that we are using non-deterministic choice, but exactly one of the two statements in the non-determinitic choice will succeed.
-==== Exercise: expanding the definition of conditionals ====
-We can also express this semantics directly as:
-\[
-    r_c(\mbox{if}(F)\mbox{ then } c_1 \mbox{ else } c_2) = \{ (s,s') \mid (f_T(F)(s) \land (s,s') \in r_c(c_1)) \lor (\lnot f_T(F)(s) \land (s,s') \in r_c(c_2) \}
-\]
-===== Non-deterministic Loop as Transitive Closure =====
-Just as we use non-deterministic choice (union) to represent conditional statment, we use non-deterministic loop statement (transitive closure) to represent while loops.  A statement //loop( c )// executes statement //c// any number of times, possibly zero.  Its meaning is transitive closure of the meaning of //c//.
-\[
-   r_c(\mbox{l{}o{}o{}p}(c_1)) = (r_c(c_1))^*
-\]
-===== Expressing While using Non-deterministic Loop =====
-We define semantics of
-  while(F) c
-as the semantics of
-  loop (assume(F);c);
-  assume(~F)
-Why does this definition make sense? It will be union of relations of the form:
-  (assume(F); c)^n
-  assume(~F)
-for all n.  But if the value of n is wrong the result will be empty relation:
-  * if it executes even if F is false, then assume(F) will give empty relation
-  * if it stops even if F is true, then assume(~F) at the end will give empty relation
-Therefore, the result is equal to the correct number of execution times.
-===== Summary of Simplifying the Language =====
-Instead of
-  c ::=  x=T | (if (F) c else c) | c ; c | (while (F) c)
-we can have language
-  c ::=  x=T | assume(F) |  c [] c  |  c ; c | loop c
-Applying it to a multiplication example, we obtain
-<code>
-r = 0;
-i = y;
-loop (
-  assume (i > 0);
-  r = r + x;
-  i = i - 1
-);
-assume (i <= 0);
-</code>
-===== Havoc Statement =====
-Havoc statement is another useful declarative statement.  It changes a given variable entirely arbitrarily: there will be one possible state for each possible value of integer variable.
-\[
-    r_c(\mbox{havoc}(x)) = \{ (s,s') \mid \forall v \in V. v \neq "x" \rightarrow s(v)=s'(v) \}
-\]
-==== Expressing Assignment with Havoc+Assume ====
-We can prove that the following equality holds under certain conditions:
-\[
-    r_c(\mbox{havoc}(x);\mbox{assume}(x=e)) = r_c(x=e)
-\]
-In other words, assigning a variable is the same as changing it arbitrarily and then assuming that it has the right value.  Under what condition does this equality hold?
-===== Summary of Correspondence between Programs and Relations =====
-# program construct # relational meaning #
-| sequential composition | relation composition |
-| branching | union |
-| loops | transitive closure |
-| assignment | update of one state component |
-| havoc | arbitrary change of one state component |
-===== Control-Flow Graphs, While Theorem: One Loop is Enough  =====
-  * from programs to control-flow graphs: transformation rules
-  * interpreter running over control flow graph
-  * normal form using only one loop and one extra variable
-Sum of all matrix elements as control-flow graph and as one loop.
-===== Proving Program Properties =====
-A general approach:
-  * compute semantics of a program as a mathematical object (e.g. relation)
-  * prove that the relation satisfies the desired property
-==== Example ====
-Consider the code
-<code>
-r = 0;
-i = y;
-while (i > 0) (
-  r = r + x;
-  i = i - 1
-)
-</code>
-We represent it as
-<code>
-r = 0;
-i = y;
-loop (
-  assume (i > 0);
-  r = r + x;
-  i = i - 1
-);
-assume (i <= 0);
-</code>
-We compute the meaning of
-  assume (i > 0);
-  r = r + x;
-  i = i - 1
-as the relation $b$ whose formula is
-\[
-    i > 0 \land r' = r + x \land i' = i - 1
-\]
-We then wish to compute the transitive closure $b^* = \bigcup_{n \ge 0} b^n$.
-We claim that for every $n \ge 0$ relation $b^n$ is given by formula
-\[
-    i \ge n \land r' = r + n x \land i' = i - n
-\]
-We can prove this claim by induction.  The union of these relations is then given by
-\[
-     \exists n \ge 0. i \ge n \land r' = r + n x \land i' = i - n
-\]
-From there we can derive the meaning of
-<code>
-r = 0;
-i = y;
-loop (
-  assume (i > 0);
-  r = r + x;
-  i = i - 1
-)
-</code>
-as
-\[
-     \exists n \ge 0. y \ge n \land r' = n x \land i' = y - n
-\]
-which when composed with //assume//($\neg (i > 0)$) = //assume//($i \leq 0$) gives
-\[
-     \exists n \ge 0. y \ge n \land r' = n x \land i' = y - n \land i' \leq 0
-\]
-which is equivalent to
-\[
-    y \ge 0 \land i' = 0 \land r' = y x
-\]
-This is the semantics of the above code.
-===== Relations as Specifications =====
-Nondeterministic programs can be used to specify other programs
-If $r$ is relation representing program and $s$ relation representing specification, we say that program meets specification iff
-\[
-    r \subseteq s
-\]
-Example:
-This allows us to prove that programs satisfy their //contracts//, expressed using preconditions and postconditions:
-  void f()
-    requires x > 0
-    ensures x > 0
-  {
-    x = x + 1;
-  }
-What is the relation for contract with
-  * precondition $P(x)$
-  * postcondition $Q(x)$
-Answer: $\{(x,x')\ \mid\ P(x) \rightarrow Q(x')\}$
-The correctness condition:
-\[
-    \{ (v,v').\ v' = v + 1 \} \subseteq \{ (v,v').\ v > 0 \rightarrow v' > 0 \}
-\]
-reduces to:
-\[
-   \forall v. \forall v'.\  v' = v + 1 \rightarrow ( v > 0 \rightarrow v' > 0)
-\]
-===== Further reading =====
-  * C A R Hoare and He Jifeng. Unifying Theories of Programming. Prentice Hall, 1998
-  * Semantics-based Program Analysis via Symbolic Composition of Transfer Relations, [[http://www.cs.cmu.edu/afs/cs.cmu.edu/project/fox/mosaic/papers/acid-thesis.ps|PhD dissertation by Christopher Colby]], 1996