LARA

Congruence Closure Algorithm and Correctness

Congruences on all Ground Terms

Let $E$ be a set of ground equalities and disequalities representing the formula

\begin{equation*}
   E := \bigwedge_{i = 0}^m t_i = t_i^\prime \wedge \bigwedge_{i = m+1}^n t_i \neq t_i^\prime
\end{equation*}

Then the following are equivalent:

  • $E$ is satisfiable
  • there exists a congruence on ground terms in which $E$ is true
  • $E$ is true in the least congruence $r_*$ containing $r_0 = \{(t_1,t_2) \mid (t_1=t_2) \in E \}$

Congruences on Ground Terms Occurring in the Formula

Instead of a congruence on the (typically infinite) set of all ground terms, we compute congruences on the set $T$ of the ground terms ocurring in $E$.

Note that $r_0 \subseteq T^2$ and that applying reflexivity, symmetry, transitivity keeps the relation in $T^2$.

We need modify congruence condition to apply only to terms in $T^2$, using the following congruence condition:

\begin{equation*}
\begin{array}{l}
   \forall x_1,\ldots,x_n,y_1,\ldots,y_n. \bigwedge_{i=1}^n (x_i,y_i) \in r\ \land \ f(x_1,\ldots,x_n) \in T \land f(y_1,\ldots,y_n) \in T  \rightarrow \\ 
    (f(x_1,\ldots,x_n),f(y_1,\ldots,y_n)) \in r
\end{equation*}

Algorith:

  • start with $r_0$
  • apply the congruence rule above as well as reflexivity, symmetry and transitivity until the fixpoint
  • denote result by $CC(r_0)$.

Theorem: conjunction of formulas in $E$ is satisfiable iff for each $t\neq t' \in E$ we have $(t,t') \notin CC(r_0)$.

Proof:

  • for each pair $(t,t') \in CC(r,r_0)$ we have that $t=t'$ is a semantic consequence of $E$
  • we use $CC(r_0)$ to construct a structure in which $E$ holds
    • take as domain the equivalence classes of $CC(r_0)$; let $[t]$ denote the equivalence class of $t$
    • define $f([t_1],\ldots,[t_n])$ as $[f(t_1,\ldots,t_n)]$ if $f(t_1,\ldots,t_n) \in T$, and arbitrary otherwise

Implementing Congruence Closure

Congruence is an equivalence relation, so the algorithm is a special case of union-find algorithms for finding partitions.

We represent expressions as directed acyclic graphs, which are syntax trees for expressions, with sharing. This data structure is called E-graph.

Within this structure we have union of inverted trees, representing union-find data structure.

Inverted tree representing equivalence classes.

Example: Let $f^k(a)$ denote $f(\ldots f(a) \ldots)$ with $k$-fold application of $f$. Consider

\begin{equation*}
   f^3(a) = a \land f^5(a)=a \land f^2(a)\neq a
\end{equation*}

Apply the congruence closure algorithm to check its satisfiability.