Substitutions for First-Order Logic

(Recall Substitution Theorems for Propositional Logic.)

Motivating Example

It is important to be precise about substitutions in first-order logic. For example, we would like to derive from formula $\forall x.F(x)$ formula $F(t)$ , denoted $subst(\{x \mapsto t\})(F)$ that results from substituting a term $t$ instead of $x$ . For example, from $\forall x. x < x + 5$ we would like to derive $y - 1 < (y - 1) + 5$ . Consider, however formula

$\begin{equation*} \forall x. \exists y. x < y \end{equation*}$

Consider an interpretation in integers. This formula is true in this domain. Now substitute instead of x the term y+1. We obtain

$\begin{equation*} \exists y. y + 1 < y \end{equation*}$

This formula is false. We say that the variable $y$ in term $y+1$ was captured during substitution. When doing substitution in first-order logic we must avoid variable capture. One way to do this is to rename bound variables. Suppose we want to instantiate the formula $\forall x. \exists y. x<y$ with $y+1$ . Then we first rename variables in the formula, obtaining

$\begin{equation*} \forall x_1. \exists y_1. x_1 < y_1 \end{equation*}$

and then after substitution $\{x_1 \mapsto y+1\}$ we obtain $\exists y_1. y + 1 < y_1$ , which is a correct consequence of $\forall x. \exists y. x < y$ .

Naive and Safe Substitutions

Let $U$ be a set of variables (that is, $U \subseteq V$ ). A variable substitution is a function $\sigma : U \to T$ where $T$ is the set of terms first-order logic.

We define naive substitution recursively, first for terms:

$\begin{equation*}\begin{array}{rcl} subst(\sigma)( x ) &=& \sigma( x ),\ \sigma \mbox{ d{}efined at } x \\ subst(\sigma)( x ) &=& x,\ \sigma \mbox{ not d{}efined at } x \\ subst(\sigma)(f(t_1,\ldots,t_n)) &=& f(subst(\sigma)(t_1),\ldots,subst(\sigma)(t_n)) \end{array}\end{equation*}$

and then for formulas:

$\begin{equation*}\begin{array}{rcl} nsubst(\sigma)(R(t_1,\ldots,t_n)) &=& R(nsubst(\sigma)(t_1),\ldots,nsubst(\sigma)(t_n)) \\ nsubst(\sigma)(t_1 = t_2) &=& (nsubst(\sigma)(t_1) = nsubst(\sigma)(t_n)) \\ nsubst(\sigma)(\lnot F) &=& \neg nsubst(\sigma)(F) \\ nsubst(\sigma)(F_1 \land F_2) &=& nsubst(\sigma)(F_1) \wedge nsubst(\sigma)(F_2) \\ nsubst(\sigma)(F_1 \lor F_2) &=& nsubst(\sigma)(F_1) \vee nsubst(\sigma)(F_2) \\ nsubst(\sigma)(\forall x.F) &=& \forall x.\ nsubst(\sigma)(F)\\ nsubst(\sigma)(\exists x.F) &=& \exists x.\ nsubst(\sigma)(F) \end{array} \end{equation*}$

Lemma: Let $\sigma = \{x_1 \mapsto t_1,\ldots,x_n \mapsto t_n\}$ be a variable substitution and $t$ a term. Then for every interpretation $I$ ,

$\begin{equation*} e_T(nsubst(\sigma)(t))(I) = e_T(t)(I[x_1 \mapsto e_T(t_1)(I),\ldots, x_n \mapsto e_T(t_n)(I)]) \end{equation*}$

To avoid variable capture, we introduce in addition to $subst$ a safe substitution, $sfsubst$ .

$\begin{equation*} \begin{array}{rcl} sfsubst(\sigma)(R(t_1,\ldots,t_n)) &=& R(nsubst(\sigma)(t_1),\ldots,nsubst(\sigma)(t_n)) \\ sfsubst(\sigma)(t_1 = t_2) &=& (nsubst(\sigma)(t_1) = nsubst(\sigma)(t_n)) \\ sfsubst(\sigma)(\lnot F) &=& \neg sfsubst(\sigma)(F) \\ sfsubst(\sigma)(F_1 \land F_2) &=& sfsubst(\sigma)(F_1) \wedge sfsubst(\sigma)(F_2) \\ sfsubst(\sigma)(F_1 \lor F_2) &=& sfsubst(\sigma)(F_1) \vee sfsubst(\sigma)(F_2) \\ sfsubst(\sigma)(\forall x.F) &=& \left\{ \begin{array}{ll} \forall x. sfsubst(\sigma)(F) & \text{if} ~ x \notin \text{domain}(\sigma) \wedge x \notin \bigcup_{v \in \text{domain}(\sigma)} FV(v) \\ sfsubst(\sigma)(\forall x^\prime. sfsubst(\{x \mapsto x^\prime \})(F)) & \text{else}} \end{array} \right.\\ sfsubst(\sigma)(\exists x.F) &=& \left\{ \begin{array}{ll} \exists x. sfsubst(\sigma)(F) & \text{if} ~ x \notin \text{domain}(\sigma) \wedge x \notin \bigcup_{v \in \text{domain}(\sigma)} FV(v) \\ sfsubst(\sigma)(\exists x^\prime. sfsubst(\{x \mapsto x^\prime \})(F)) & \text{else}} \end{array} \right. \end{array} \end{equation*}$

Lemma: Let $\sigma = \{x_1 \mapsto t_1,\ldots,x_n \mapsto t_n\}$ be a variable substitution and $t$ a term. Then for every interpretation $I$ ,

$\begin{equation*} e_F(sfsubst(\sigma)(F))(I) = e_F(F)(I[x_1 \mapsto e_T(t_1)(I),\ldots, x_n \mapsto e_T(t_n)(I)]) \end{equation*}$

Lemma: $(\forall x.F) \models sfsubst(\{x \mapsto t\}(F)$ .