LARA


Complexity of the verification process

Preliminary remarks

Terminology

  • We say that two patterns have the same signature if, without their (optional) guard, they are identical.
  • The set of a pattern is assumed to be the cartesian product representing the set of inputs that the pattern would match.

Number of different positions

The maximum number of different positions $max_p(E)$ that can be found in a given pattern matching expression $E$ is linear in the size of the source code (whether that size is measured in tokens, elements of the AST or characters is irrelevant).

Computing the set corresponding to a pattern can be done in linear time

…in terms of the size of the source code, that is. This should be obvious…

Consistency of the dimensions of the cartesian products

We know that every pattern corresponds to one cartesian product. By construction, the dimension of these products is always the same, since it corresponds to $max_p(E)$. In particular, this means that taking the intersection of two “pattern sets” yields yet another cartesian product of the same dimension, namely $(A_1 \cap B_1) \times \ldots \times (A_n \cap B_n)$, and that their union is still a set of tuples of that dimension.

Note that we now also have:

\[ (A_1 \times A_2 \times \ldots \times A_n) \cap (B_1 \times B_2 \times \ldots \times B_n) = \emptyset \iff (A_1 \cap B_1 = \emptyset ) \vee (A_2 \cap B_2 = \emptyset ) \vee \ldots \vee (A_n \cap B_n = \emptyset ) \]

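The proof is straightforward: a tuple lies in both products exactly when each of its components lies in the corresponding $A_i \cap B_i$. It still requires that the dimensions match; otherwise the intersection would be empty for another reason, namely that tuples of different lengths are never equal.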
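The equivalence can be sanity-checked by brute force on small finite sets. The following is only an illustrative sketch; the function names are made up for this example, and an exhaustive check on a tiny universe is of course not a proof.

```python
from itertools import combinations, product

def product_intersection_empty(As, Bs):
    """True if A_1 x ... x A_n and B_1 x ... x B_n share no tuple."""
    assert len(As) == len(Bs), "the dimensions must match"
    return set(product(*As)).isdisjoint(product(*Bs))

def some_component_disjoint(As, Bs):
    """True if A_i and B_i are disjoint for at least one index i."""
    return any(set(a).isdisjoint(b) for a, b in zip(As, Bs))

def equivalence_holds(universe, n):
    """Compare both sides for every choice of n subsets per side."""
    subsets = [frozenset(c)
               for r in range(len(universe) + 1)
               for c in combinations(universe, r)]
    return all(
        product_intersection_empty(As, Bs) == some_component_disjoint(As, Bs)
        for As in product(subsets, repeat=n)
        for Bs in product(subsets, repeat=n)
    )
```

Calling `equivalence_holds([0, 1], 2)` compares the two sides for all 256 combinations of subsets of a two-element universe in dimension 2.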

Translation of the assumptions/axioms

We will always use unary relations to represent sets (and therefore set membership): $x \in A$ will be translated to $A(x)$, and $x \notin A$ to $\neg A(x)$.

For each statement that we want to prove or disprove, we make use of the user-provided information about properties of the class hierarchy and the extractors.

These properties are always in one of the following forms:

  • Set of extractors covering a given type: $E_1 \cup \ldots \cup E_n \supseteq T$
  • Sealed and abstract classes: $T = S_1 \cup \ldots \cup S_n$
  • Disjointness of extractors: $E \cap F = \emptyset$

This translates to:

  • $\forall x . T(x) \Rightarrow (E_1(x) \vee \ldots \vee E_n(x))$
  • $\forall x . (S_1(x) \vee \ldots \vee S_n(x)) \Leftrightarrow T(x)$
  • and $\forall x . \neg (E(x) \wedge F(x))$ respectively.
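To make the three set-level properties ($E_1 \cup \ldots \cup E_n \supseteq T$, $T = S_1 \cup \ldots \cup S_n$ and $E \cap F = \emptyset$) concrete, here is a minimal sketch that evaluates them over a finite universe, with sets represented as unary predicates. The function names and the toy model are assumptions made for illustration; they are not part of the verification procedure itself.

```python
def covers(universe, extractors, T):
    """E_1 ∪ ... ∪ E_n ⊇ T: every x in T is matched by some extractor."""
    return all(any(E(x) for E in extractors) for x in universe if T(x))

def sealed(universe, subclasses, T):
    """T = S_1 ∪ ... ∪ S_n: x is in T exactly when it is in some S_i."""
    return all(T(x) == any(S(x) for S in subclasses) for x in universe)

def disjoint(universe, E, F):
    """E ∩ F = ∅: no x is in both E and F."""
    return not any(E(x) and F(x) for x in universe)

# Toy model (purely illustrative): the universe is a handful of integers.
universe = range(6)
T = lambda x: x < 4          # the "type"
S1 = lambda x: x < 2         # two "subclasses" that together make up T
S2 = lambda x: 2 <= x < 4
E = lambda x: x % 2 == 0     # two "extractors": evens and odds
F = lambda x: x % 2 == 1
```

In this toy model `covers(universe, [E, F], T)`, `sealed(universe, [S1, S2], T)` and `disjoint(universe, E, F)` all hold, whereas `covers(universe, [S1], T)` does not, since `S1` alone misses the elements 2 and 3 of `T`.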

Verifying disjointness

Recall that disjointness is verified for each pair of patterns. There are two options:

  1. Either the patterns have different signatures, in which case we ignore their guards and verify that the intersection of their sets is empty…
  2. …or they have the same signature, in which case we simply verify that the conjunction of their guards is false.
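The case distinction can be sketched as follows. This is an assumption-laden illustration: the `Pattern` class, the representation of signatures and guards as plain strings, and the obligation labels are all hypothetical, and comparing signatures by string equality is a simplification.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Pattern:
    signature: str        # the pattern with its guard stripped
    guard: str = "true"   # the optional guard; "true" when absent

def obligation(p, q):
    """Choose which of the two checks applies to the pair (p, q)."""
    if p.signature != q.signature:
        # Case 1: ignore the guards, compare the pattern sets.
        return ("sets-disjoint", p.signature, q.signature)
    # Case 2: same signature, so only the guards can tell them apart.
    return ("guards-disjoint", p.guard, q.guard)

def obligations(patterns):
    """One obligation per unordered pair of patterns."""
    return [obligation(p, q) for p, q in combinations(patterns, 2)]
```

For example, the three patterns `Pattern("Cons(x, Nil)")`, `Pattern("Cons(x, xs)", "x > 0")` and `Pattern("Cons(x, xs)", "x <= 0")` give three obligations: two of the first kind and one of the second. In general $k$ patterns give $k(k-1)/2$ obligations.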

Since we limit our boolean expressions to QFBAPA (FIXME do we?), the formula for verifying the disjointness of two guards $g_1$ and $g_2$, namely $\neg(g_1 \wedge g_2)$, is itself in QFBAPA. The guards are disjoint exactly when $g_1 \wedge g_2$ is unsatisfiable, and the satisfiability of a QFBAPA formula can be checked with an algorithm in NP [1]. This solves the second case.
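To illustrate what is being asked of the guards, the sketch below searches a bounded set of candidate values for a witness that satisfies both guards. It is emphatically not a QFBAPA decision procedure: guards are modelled as plain Python predicates over a single integer (an assumption made for this example), a witness refutes disjointness, and the absence of a witness within the bound proves nothing.

```python
def overlap_witness(g1, g2, candidates):
    """Return a value satisfying both guards, or None if none is found."""
    for x in candidates:
        if g1(x) and g2(x):
            return x
    return None

# Guards "x > 0" and "x <= 0" never hold together, so no witness exists.
no_witness = overlap_witness(lambda x: x > 0, lambda x: x <= 0, range(-50, 51))

# Guards "x >= 0" and "x <= 0" overlap at exactly one value.
witness = overlap_witness(lambda x: x >= 0, lambda x: x <= 0, range(-50, 51))
```

Here `no_witness` is `None` and `witness` is `0`, which shows that the second pair of guards is not disjoint.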

The first case is not much harder. We can use the above equivalence for the intersection of cartesian products of the same dimension, and we need only one quantifier per component to transform it to the desired form:

  • $(A_1 \times A_2 \times \ldots \times A_n) \cap (B_1 \times B_2 \times \ldots \times B_n) = \emptyset$

…is translated to:

  • $(\forall x . \neg(A_1(x) \wedge B_1(x))) \vee \ldots \vee (\forall x . \neg(A_n(x) \wedge B_n(x)))$

The quantifier has to stay inside each disjunct: a single $\forall x$ placed in front of the whole disjunction would state something strictly weaker.

Note that checking the satisfiability should be doable in polynomial time, as all that needs to be done is to find, among a finite list, one pair of sets that are disjoint (and this information can only come from the assumptions/axioms, which are themselves finite in number). (FIXME: I know this doesn't sound convincing… let's do better.)
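Under the strong assumption that disjointness is only ever stated directly by an axiom of the form $E \cap F = \emptyset$, the search described above amounts to one lookup per component. The sketch below uses hypothetical names and represents sets by their names; it ignores any disjointness that would only follow indirectly from the other axioms, so it can answer "not known to be disjoint" for products that are in fact disjoint.

```python
def declared_disjoint(a, b, disjoint_axioms):
    """True if an axiom states directly that sets a and b are disjoint."""
    return frozenset((a, b)) in disjoint_axioms

def products_disjoint(As, Bs, disjoint_axioms):
    """True if some component pair (A_i, B_i) is declared disjoint."""
    assert len(As) == len(Bs), "the dimensions must match"
    return any(declared_disjoint(a, b, disjoint_axioms)
               for a, b in zip(As, Bs))

# Hypothetical axioms: Nil and Cons are disjoint, as are Some and None.
axioms = {frozenset(("Nil", "Cons")), frozenset(("Some", "None"))}
```

With these axioms, `products_disjoint(["Any", "Nil"], ["Any", "Cons"], axioms)` holds because of the second component, while `products_disjoint(["Any", "Nil"], ["Any", "Nil"], axioms)` does not. Each lookup is a hash-set membership test, so a pair of patterns is handled in time linear in the dimension.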

[1] Quantifier Free Boolean Algebra with Presburger Arithmetic is NP-Complete, technical report somewhere IIRC.