
sav08:relational_semantics [2009/03/04 16:08]
sav08:relational_semantics [2015/04/21 17:30]
-====== Relational Semantics of a Simple Language ====== 
-We next give a meaning (semantics) to our [[Simple Programming Language]] using relations between the initial and final states of each command.  We illustrate the use of this semantics by showing certain very simple program equivalence properties.
-When we write "x" we mean the variable named x viewed as a syntactic object (a string or a node in the syntax tree).  We sometimes omit the quotes.
-===== Examples of Semantics for Some Programs ===== 
-In such a semantics, if the state of the program is given by one integer variable //x//, then the meaning of this '​increment program':​ 
-  x = x + 3; 
-  x = x - 2 
-is the relation $r_1 = \{(0,​1),​(1,​2),​(2,​3),​...,​(-1,​0),​(-2,​-1),​(-3,​-2),​...\}$.  ​ 
-A pair of integers $(v,v')$ is in the relation, written $(v,v') \in r_1$, if and only if $v' = v + 1$.  Using set comprehension, we write
-    r_1 = \{(v,​v'​) \mid v'=v+1 \} 
-The meaning $r_2$ of
-  x = x + x
-is
-   r_2 = \{ (v,v') \mid v' = 2v \}
-The meaning of
-  while (true) { }
-is the empty relation.
-=== Why Relations === 
-The meaning is, in general, an arbitrary //​relation//​. ​ Therefore: 
-  * For certain states there will be no results.  In particular, if a computation starting at a state $s$ does not terminate, then for no $s'$ will there be a pair of the form $(s,s')$ in the relation.
-  * For certain states there will be multiple results, when $(s,s_1) \in r$ and $(s,s_2) \in r$ for $s_1 \neq s_2$.  Intuitively, this means command execution starting in $s$ will sometimes compute $s_1$ and sometimes $s_2$.  Verification of such a program must account for both possibilities.
-  * Multiple results are important for modelling e.g. concurrency, as well as for approximating behavior that we do not know (e.g. what the operating system or environment will do, or what the result of a complex computation is).
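The three situations in the list above (no result, exactly one result, several results) can be made concrete with a small Python sketch. It assumes a state that is a single integer restricted to a finite window, so that relations become finite sets of pairs; the names `WINDOW`, `image` and `maybe_increment` are illustrative and not part of the course material.

```python
# A relation on states is modelled as a finite set of (s, s') pairs.
# Assumption: the state is one integer, restricted to a small window.
WINDOW = range(-3, 4)

def image(r, s):
    """All final states s' such that (s, s') is in the relation r."""
    return {t for (u, t) in r if u == s}

# Exactly one result: x = x + 1 (pairs leaving the window are cut off).
increment = {(v, v + 1) for v in WINDOW if v + 1 in WINDOW}

# No result: while (true) { } denotes the empty relation.
diverge = set()

# Several results: from v the command may produce v or v + 1.
maybe_increment = {(v, v) for v in WINDOW} | increment

print(image(increment, 0))        # {1}
print(image(diverge, 0))          # set()
print(image(maybe_increment, 0))  # {0, 1}
```

A verifier that reasons about `maybe_increment` has to consider both final states, which is the point made in the second bullet.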
-===== Program States, Relations on States ===== 
-Our programs can have any finite number of integer variables. ​ Let $V$ denote the set of variable names such as '​x'​ in the example above. ​ The state of our program is a function $V \to R$ mapping variable names to their values, and the set of states 
-    S = (V \to R) 
-is the set of all such functions.  We assume that the range of variables is $R = \mathbb{Z}$ (the set of integers), but most of what we say is independent of whether the variables are integers or have some other type.
-In the above increment program, $V = \{"x"\}$ and a state is of the form $\{("x",v)\}$ for some integer $v$ (recall that we represent each function $f$ with domain $D$ as the relation $\{(x,f(x)) \mid x \in D \}$).  So the meaning of the increment program is given by a relation, call it $r$, that maps the state $\{("x",v)\}$ to the state $\{("x",v+1)\}$:
-    r = \{(\{("​x",​v)\},​ \{("​x",​v'​)\}) \mid v' = v + 1 \} 
-What happens when we have more than one variable? ​ Let $V = \{"​x",​ "​y"​\}$ and consider this program: 
-  x = y + y 
-The meaning of this program is given by relation 
-   r = \{(\{("​x",​v),​("​y",​u)\},​\{("​x",​v'​),​("​y",​u'​)\}) \mid v'=2u \land u' = u \} 
-Renaming variables so the initial value of variable "​x"​ is denoted $x$ and final value denoted $x'$, we have 
-   r = \{(\{("​x",​x),​("​y",​y)\},​\{("​x",​x'​),​("​y",​y'​)\}) \mid x'=2y \land y' = y \} 
-We call the relation //r// the //relational semantics// of this command and the formula $x'=2y \land y'=y$ the //formula semantics// of this command.
-When we have multiple variables, we can give semantics to assignment statements using [[sets and relations#​function_update|function update notation]]. ​ Regardless of the number of variables in $V$, the relational semantics of //x=y+y;// is 
-    r = \{(s,​s["​x"​ \mapsto 2 s("​y"​)]) \mid s \in S \} 
-===== Semantic Functions ===== 
-Let $C$ denote the set of all commands (programs).  Our goal is to define a semantic function for commands, denoted $r_c$ (**r**elation of the **c**ommand), which maps each command $c_1$ to its relational semantics $r_c(c_1) \subseteq S \times S$.  Formally,
-   r_c : C \to 2^{S \times S} 
-To define the meaning of statements such as //x=y+y;// we need a way to evaluate a term such as //y+y// in a given state.  We introduce a function $f_T$ (**f**unction of the **t**erm), which maps a term to a function from states to values:
-   f_T : T \to (S \to R) 
-===== Meaning of Assignment ===== 
-Given some definition of $f_T$, the meaning of the assignment statement is simply 
-   ​r_c(x=e) = \{ (s,​s["​x"​ \mapsto f_T(e)(s)]) \mid s \in S \} 
-===== Meaning of Terms ===== 
-We next define the meaning of terms, whose syntax we defined [[simple programming language|earlier]].
-  T ::= K | V | (T + T) | (T - T) | (K * T) | (T / K) | (T % K) 
-  F ::= (T=T) | (T < T) | (T > T) | (~F) | (F & F) | (F|F) 
-  V ::= x | y | z | ... 
-  K ::= 0 | 1 | 2 | ... 
-The recursive definition corresponds to a program you would write in e.g. Scala to compute the value of an expression in a given state when writing an interpreter. 
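As a rough illustration of such an interpreter, here is a minimal evaluator sketch. Python is used purely for illustration (the text mentions Scala as one option). The tuple encoding of terms, the dictionary representation of states and the use of floor division for `/` are assumptions of this sketch, not definitions taken from the course. The encoding is also slightly more liberal than the grammar, since it does not force one operand of `*`, `/` and `%` to be a constant.

```python
import operator

# Assumed encoding: ("const", k), ("var", "x"), ("not", f),
# and (op, left, right) for every binary operator of the grammar.
BINARY = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,  # rounding convention is an assumption
    "%": operator.mod,
    "=": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
    "&": lambda a, b: a and b,
    "|": lambda a, b: a or b,
}

def f_T(t, s):
    """Value of term or formula t in state s (a dict from names to ints)."""
    tag = t[0]
    if tag == "const":
        return t[1]
    if tag == "var":
        return s[t[1]]
    if tag == "not":
        return not f_T(t[1], s)
    # Binary case: evaluate both subterms in the same state, then combine.
    return BINARY[tag](f_T(t[1], s), f_T(t[2], s))

state = {"x": 5, "y": 2}
y_plus_y = ("+", ("var", "y"), ("var", "y"))
print(f_T(y_plus_y, state))                       # 4
print(f_T(("<", ("var", "x"), y_plus_y), state))  # False
```

Each branch of `f_T` mirrors one clause of a recursive definition over the grammar: constants and variables are the base cases, and every operator evaluates its subterms in the same state `s`.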
-Based on this context-free grammar, we define the semantics of terms recursively (we use the same function $f_T$ for formulas, whose values are truth values):
-   f_T(k)(s) & = & k \\
-   f_T(v)(s) & = & s(v) \\
-   f_T(t_1 + t_2)(s) & = & f_T(t_1)(s) + f_T(t_2)(s) \\
-   f_T(t_1 - t_2)(s) & = & f_T(t_1)(s) - f_T(t_2)(s) \\
-   f_T(k * t)(s) & = & k * f_T(t)(s) \\
-   f_T(t / k)(s) & = & f_T(t)(s) / k \\
-   f_T(t \% k)(s) & = & f_T(t)(s) \% k \\
-   f_T(t_1 = t_2)(s) & = & (f_T(t_1)(s) = f_T(t_2)(s)) \\
-   f_T(t_1 < t_2)(s) & = & (f_T(t_1)(s) < f_T(t_2)(s)) \\
-   f_T(t_1 > t_2)(s) & = & (f_T(t_1)(s) > f_T(t_2)(s)) \\
-   f_T(\neg f)(s) & = & \neg f_T(f)(s) \\
-   f_T(f_1 \& f_2)(s) & = & f_T(f_1)(s) \land f_T(f_2)(s) \\
-   f_T(f_1 | f_2)(s) & = & f_T(f_1)(s) \lor f_T(f_2)(s)
-===== Sequential Composition as Relation Composition ===== 
-To execute a statement sequence, we execute the first statement and from the resulting state execute the second statement. ​ We therefore define statement sequential composition as relation composition of statement meanings: 
-   ​r_c(c_1\ ;\ c_2) = r_c(c_1) \circ r_c(c_2) 
-==== Example: commuting assignments ====  
-Consider the task of a programmer or a compiler who wishes to reorder two assignment statements, converting
-  x = e1;
-  y = e2
-into
-  y = e2;
-  x = e1
-where $x,y \in V$ and $x \neq y$.  Under what conditions will these two program fragments have the same meaning? 
-To answer this question, we expand the meaning of both fragments:
-&   & r_c ( x = e_1 ; y = e_2 ) \\
-& = & r_c ( x = e_1 ) \circ r_c ( y = e_2 ) \\
-& = & \{ ( s , s[ "x" \mapsto f_T(e_1)(s) ])~|~s \in S \} \circ \{ ( s , s[ "y" \mapsto f_T(e_2)(s) ])~|~s \in S \} \\
-& = & \{ ( s , ( s["x" \mapsto f_T(e_1)(s)] )["y" \mapsto f_T(e_2)( s["x" \mapsto f_T(e_1)(s) ] ) ] )~|~s \in S \}
-Symmetrically, we have:
-&   & r_c ( y = e_2 ; x = e_1 ) \\
-& = & \{ ( s , ( s["y" \mapsto f_T(e_2)(s)] )["x" \mapsto f_T(e_1)( s["y" \mapsto f_T(e_2)(s) ] ) ] )~|~s \in S \}
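-Thus, if $y$ does not occur in $e_1$ and $x$ does not occur in $e_2$, these two fragments have the same meaning: in that case $f_T(e_2)(s["x" \mapsto f_T(e_1)(s)]) = f_T(e_2)(s)$ and $f_T(e_1)(s["y" \mapsto f_T(e_2)(s)]) = f_T(e_1)(s)$, and updates of two distinct variables commute.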
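The two compositions can also be compared mechanically on sample states. The sketch below is an assumption-laden illustration rather than the course's own code. A command is represented by the function that maps a state to the set of its possible final states (the image of the state under $r_c$), terms are given directly as Python functions on states, and the comparison covers only a small finite sample of initial states. The helper names `freeze`, `assign`, `seq`, `same_meaning` and `swap_ok` are hypothetical.

```python
from itertools import product

def freeze(s):
    """Hashable form of a state dict, so that states can be put in sets."""
    return tuple(sorted(s.items()))

def assign(var, e):
    """r_c(var = e): the single state s[var -> e(s)]."""
    return lambda s: {freeze({**s, var: e(s)})}

def seq(c1, c2):
    """r_c(c1 ; c2): run c2 from every state that c1 can produce."""
    return lambda s: {t2 for t1 in c1(s) for t2 in c2(dict(t1))}

def same_meaning(c1, c2, values=range(-2, 3)):
    """Compare two commands on a finite sample of initial states."""
    return all(c1({"x": x, "y": y}) == c2({"x": x, "y": y})
               for x, y in product(values, repeat=2))

def swap_ok(e1, e2):
    """Does 'x = e1; y = e2' agree with 'y = e2; x = e1' on the sample?"""
    return same_meaning(seq(assign("x", e1), assign("y", e2)),
                        seq(assign("y", e2), assign("x", e1)))

print(swap_ok(lambda s: s["x"] + 1, lambda s: s["y"] * 2))  # True
print(swap_ok(lambda s: s["y"] + 1, lambda s: 7))           # False
```

In the first call `e1` reads only `x` and `e2` reads only `y`, and the two orders agree. In the second call `e1` reads `y`, which the other assignment overwrites, and the two orders disagree on every sampled state.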
-===== Assume Statement ===== 
-The assume statement is a useful declarative statement, which we will use to define the meaning of several remaining commands.  The meaning of //assume// is simply the diagonal relation restricted to the states in which the given expression is true: it produces no resulting states if the expression is false, and does nothing if the expression is true:
-    r_c(\mbox{assume}(e)) = \{ (s,s) \mid f_T(e)(s) \} = \Delta_P 
-where $P = \{ s \mid f_T(e)(s) \}$. 
-===== Non-deterministic Choice as Union ===== 
-Non-deterministic choice is another useful declarative statement, which non-deterministically chooses to execute one of the two statements.  We use it to express other statements, as well as to model branching with conditions that we cannot express in the language.  Its meaning is simply the union of the two relations:
-   ​r_c(c_1\ \mbox{[]}\ c_2) = r_c(c_1) \cup r_c(c_2) 
-===== Representing if-then-else Using Non-Deterministic Choice and Assume ===== 
-We will define the semantics of 
-  if (F) then c1 else c2 
-as the semantics of 
-  (assume(F); c1) [] (assume(~F);​ c2) 
-Note that we are using non-deterministic choice, but in every state exactly one of the two assume statements in the non-deterministic choice will succeed.
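The encoding of the conditional can be tried out in a few lines of Python. This is a sketch under simplifying assumptions: the state is just the integer value of one variable x, a command is the function mapping a state to the set of its possible final states, and conditions are Python predicates. The names `assume`, `choice`, `seq` and `if_then_else` are illustrative.

```python
def assume(cond):
    """Keep the state if cond holds, otherwise produce no result."""
    return lambda s: {s} if cond(s) else set()

def choice(c1, c2):
    """Non-deterministic choice: union of the results of both commands."""
    return lambda s: c1(s) | c2(s)

def seq(c1, c2):
    """Sequential composition: run c2 from every result of c1."""
    return lambda s: {t2 for t1 in c1(s) for t2 in c2(t1)}

def if_then_else(cond, c1, c2):
    """(assume(F); c1) [] (assume(~F); c2)"""
    return choice(seq(assume(cond), c1),
                  seq(assume(lambda s: not cond(s)), c2))

negate = lambda s: {-s}  # x = 0 - x
skip = lambda s: {s}     # do nothing

# if (x < 0) then x = 0 - x else skip
abs_x = if_then_else(lambda s: s < 0, negate, skip)
print(abs_x(-4))  # {4}
print(abs_x(3))   # {3}
```

Although `choice` takes the union of both branches, each call returns a single state, because one of the two `assume` commands always filters its branch away.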
-==== Exercise: expanding the definition of conditionals ====  
-We can also express this semantics directly as:
-    r_c(\mbox{if}(F)\mbox{ then } c_1 \mbox{ else } c_2) = \{ (s,s') \mid (f_T(F)(s) \land (s,s') \in r_c(c_1)) \lor (\lnot f_T(F)(s) \land (s,s') \in r_c(c_2)) \}
-===== Non-deterministic Loop as Transitive Closure ===== 
-Just as we use non-deterministic choice (union) to represent conditional statements, we use a non-deterministic loop statement (transitive closure) to represent while loops.  A statement //loop( c )// executes statement //c// any number of times, possibly zero.  Its meaning is the reflexive transitive closure of the meaning of //c//:
-   r_c(\mbox{loop}(c_1)) = (r_c(c_1))^*
-===== Expressing While using Non-deterministic Loop ===== 
-We define semantics of 
-  while(F) c 
-as the semantics of 
-  loop (assume(F);​c);​ 
-  assume(~F) 
-Why does this definition make sense? Its meaning is the union of the relations of the form
-  (assume(F); c)^n;
-  assume(~F)
-for all $n \ge 0$.  For a given initial state, if the value of n is wrong, the corresponding relation produces no result:
-  * if the loop body executes when F is false, then assume(F) gives no result
-  * if the loop stops while F is still true, then the final assume(~F) gives no result
-Therefore, only the terms with the correct number of iterations contribute to the result.
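The filtering effect of the two assume statements can be observed directly. In this sketch (again with an integer state and commands as functions from a state to the set of possible final states) `loop` computes the set of states reachable by running its body zero or more times. The search terminates only when finitely many states are reachable from the initial state, which is an assumption of the sketch. All helper names are illustrative.

```python
def assume(cond):
    return lambda s: {s} if cond(s) else set()

def seq(c1, c2):
    return lambda s: {t2 for t1 in c1(s) for t2 in c2(t1)}

def loop(c):
    """States reachable from s by running c zero or more times."""
    def run(s):
        seen, frontier = {s}, [s]
        while frontier:
            for t in c(frontier.pop()):
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
        return seen
    return run

def while_(cond, c):
    """loop (assume(F); c); assume(~F)"""
    return seq(loop(seq(assume(cond), c)),
               assume(lambda s: not cond(s)))

decrement = lambda s: {s - 1}  # x = x - 1

# Without the final assume, every intermediate state is a possible result.
body_loop = loop(seq(assume(lambda s: s > 0), decrement))
print(sorted(body_loop(3)))  # [0, 1, 2, 3]

# while (x > 0) x = x - 1
countdown = while_(lambda s: s > 0, decrement)
print(countdown(3))   # {0}
print(countdown(-2))  # {-2}
```

The non-deterministic loop alone may stop after any number of iterations, and the final `assume` discards every stopping point at which the condition still holds.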
-===== Summary of Simplifying the Language ===== 
-Instead of 
-  c ::=  x=T | (if (F) c else c) | c ; c | (while (F) c) 
-we can use the language
-  c ::=  x=T | assume(F) |  c [] c  |  c ; c | loop c 
-Applying it to [[simple programming language|Collatz example]] we obtain 
-r = 0; 
-i = y; 
-loop (
-  assume (i > 0);
-  r = r + x;
-  i = i - 1
-);
-assume (i <= 0);
-===== Havoc Statement ===== 
-The havoc statement is another useful declarative statement.  It changes a given variable entirely arbitrarily: there is one possible final state for each possible value of the integer variable.
-    r_c(\mbox{havoc}(x)) = \{ (s,s') \mid \forall v \in V. v \neq "​x"​ \rightarrow s(v)=s'​(v) \} 
-==== Expressing Assignment with Havoc+Assume ==== 
-We can prove that the following equality holds under certain conditions: 
-    r_c(\mbox{havoc}(x);​\mbox{assume}(x=e)) = r_c(x=e) 
-In other words, assigning a variable is the same as changing it arbitrarily and then assuming that it has the right value. ​ Under what condition does this equality hold? 
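One way to experiment with this question is to compare both sides on concrete states. The sketch below makes two loud assumptions: the integers are replaced by the finite range `VALUES` so that `havoc` has finitely many results, and terms are given as Python functions on states. The helper names are hypothetical, and a test on one state is of course no proof.

```python
VALUES = range(-3, 4)  # finite stand-in for the integers (an assumption)

def freeze(s):
    """Hashable form of a state dict."""
    return tuple(sorted(s.items()))

def havoc(var):
    """One final state for each possible value of var."""
    return lambda s: {freeze({**s, var: v}) for v in VALUES}

def assume(cond):
    return lambda s: {freeze(s)} if cond(s) else set()

def assign(var, e):
    return lambda s: {freeze({**s, var: e(s)})}

def seq(c1, c2):
    return lambda s: {t2 for t1 in c1(s) for t2 in c2(dict(t1))}

def havoc_assume(var, e):
    """havoc(var); assume(var = e)"""
    return seq(havoc(var), assume(lambda s: s[var] == e(s)))

s0 = {"x": 1, "y": 2}
e_y = lambda s: s["y"]       # a term that does not mention x
e_x1 = lambda s: s["x"] + 1  # a term that mentions x

print(havoc_assume("x", e_y)(s0) == assign("x", e_y)(s0))    # True
print(havoc_assume("x", e_x1)(s0) == assign("x", e_x1)(s0))  # False
```

In the second comparison the term is evaluated after x has already been changed arbitrarily, so `assume(x = x + 1)` can never succeed and the left-hand side has no results.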
-===== Control-Flow Graphs, While Theorem: One Loop is Enough ​ ===== 
-  * from programs to control-flow graphs: transformation rules 
-  * interpreter running over control flow graph 
-  * normal form using only one loop and one extra variable 
-Sum of all matrix elements as control-flow graph and as one loop. 
-===== Proving Program Properties ===== 
-A general approach: 
-  * compute semantics of a program as a mathematical object (e.g. relation) 
-  * prove that the relation satisfies the desired property 
-==== Example ==== 
-Consider the code 
-r = 0; 
-i = y; 
-while (i > 0) (
-  r = r + x;
-  i = i - 1
-)
-We represent it as  
-r = 0; 
-i = y; 
-loop (
-  assume (i > 0);
-  r = r + x;
-  i = i - 1
-);
-assume (i <= 0);
-We compute the meaning of 
-  assume (i > 0); 
-  r = r + x; 
-  i = i - 1 
-as the relation $b$ whose formula is 
-    i > 0 \land r' = r + x \land i' = i - 1 
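-We then wish to compute the reflexive transitive closure $b^* = \bigcup_{n \ge 0} b^n$.
-The relation $b^0$ is the identity, with formula $r' = r \land i' = i$.  We claim that for every $n \ge 1$ the relation $b^n$ is given by the formula
-    i \ge n \land r' = r + n x \land i' = i - n
-(In all these formulas, variables that are not mentioned keep their values.)  We can prove this claim by induction on $n$.  The union of these relations is then given by
-     (r' = r \land i' = i) \lor (\exists n \ge 1. i \ge n \land r' = r + n x \land i' = i - n)
-From there we can derive the meaning of
-r = 0;
-i = y;
-loop (
-  assume (i > 0);
-  r = r + x;
-  i = i - 1
-)
-which is
-     (r' = 0 \land i' = y) \lor (\exists n \ge 1. y \ge n \land r' = n x \land i' = y - n)
-which when composed with //assume//($\neg (i > 0)$) = //assume//($i \leq 0$) gives
-     (r' = 0 \land i' = y \land i' \leq 0) \lor (\exists n \ge 1. y \ge n \land r' = n x \land i' = y - n \land i' \leq 0)
-In the second disjunct, $y \ge n$ and $y - n \leq 0$ force $n = y$, so the formula is equivalent to
-    (y \ge 0 \land i' = 0 \land r' = y x) \lor (y < 0 \land i' = y \land r' = 0)
-This is the semantics of the above code.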
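A derived formula of this kind can be sanity-checked by executing the loop-and-assume form of the program on a few inputs. The following sketch is only a spot check on sample values with $y \ge 0$, not a proof. The `State` tuple and the helpers `step`, `seq`, `loop` and `assume` are assumptions of the sketch.

```python
from collections import namedtuple

State = namedtuple("State", "x y r i")

def assume(cond):
    return lambda s: {s} if cond(s) else set()

def step(f):
    """Deterministic command given by a function on states."""
    return lambda s: {f(s)}

def seq(*commands):
    def run(s):
        states = {s}
        for c in commands:
            states = {t for u in states for t in c(u)}
        return states
    return run

def loop(c):
    """States reachable by running c zero or more times."""
    def run(s):
        seen, frontier = {s}, [s]
        while frontier:
            for t in c(frontier.pop()):
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
        return seen
    return run

program = seq(
    step(lambda s: s._replace(r=0)),
    step(lambda s: s._replace(i=s.y)),
    loop(seq(assume(lambda s: s.i > 0),
             step(lambda s: s._replace(r=s.r + s.x)),
             step(lambda s: s._replace(i=s.i - 1)))),
    assume(lambda s: s.i <= 0),
)

for x, y in [(3, 4), (-2, 5), (7, 0)]:
    finals = program(State(x=x, y=y, r=99, i=99))
    print(finals == {State(x=x, y=y, r=y * x, i=0)})  # True each time
```

The initial values of `r` and `i` are arbitrary (99 here) because the program overwrites both before reading them.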
-===== Relations as Specifications ===== 
-Nondeterministic programs can be used to specify other programs.
-If $r$ is a relation representing a program and $s$ a relation representing a specification, we say that the program meets the specification iff
-    r \subseteq s 
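On a finite window of integer states the inclusion can be checked literally as a subset test. The sketch below assumes a single integer variable and an example specification ("the final value is strictly larger than the initial one") chosen only for illustration. Pairs whose result would leave the window are dropped, so the check says nothing about values outside it.

```python
WINDOW = range(-5, 6)  # finite stand-in for the integers (an assumption)

def meets(r, spec):
    """The program r meets the specification spec iff r is a subset of spec."""
    return r <= spec

# Specification: the final value is strictly larger than the initial one.
spec = {(v, w) for v in WINDOW for w in WINDOW if w > v}

# Two candidate programs, as relations on the window.
increment = {(v, v + 1) for v in WINDOW if v + 1 in WINDOW}  # x = x + 1
double = {(v, 2 * v) for v in WINDOW if 2 * v in WINDOW}     # x = x + x

print(meets(increment, spec))  # True
print(meets(double, spec))     # False
```

The specification is itself a non-deterministic program with many allowed results per initial state. `double` fails because, for example, the pair (0, 0) is not allowed by the specification.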
-This allows us to prove that programs satisfy their //​contracts//,​ expressed using preconditions and postconditions:​ 
-  void f() 
-    requires x > 0 
-    ensures x > 0 
-  { 
-    x = x + 1; 
-  } 
-What is the relation for a contract with
-  * precondition $P(x)$ 
-  * postcondition $Q(x)$ 
-Answer: $\{(x,​x'​)\ \mid\ P(x) \rightarrow Q(x'​)\}$ 
-The correctness condition:
-    \{ (v,v') \mid v' = v + 1 \} \subseteq \{ (v,v') \mid v > 0 \rightarrow v' > 0 \}
-reduces to: 
-   ​\forall v. \forall v'​.\ ​ v' = v + 1 \rightarrow ( v > 0 \rightarrow v' > 0) 
-===== Further reading ===== 
-  * C A R Hoare and He Jifeng. Unifying Theories of Programming. Prentice Hall, 1998 
-  * Semantics-based Program Analysis via Symbolic Composition of Transfer Relations, [[http://​​afs/​​project/​fox/​mosaic/​papers/​|PhD dissertation by Christopher Colby]], 1996