LARA – Lab for Automated Reasoning and Analysis -

* why the problem is interesting

deciding WS1S: done through automata construction (MONA)
main problem: quantifier elimination (QE)
is there another way to do QE? can we characterize relations in WS1S in some other way?
WS1S is too big: look at a smaller subset of the language
our L: characterize unary relations, binary relations, …, n-ary relations

* what was done before

p-recognizability

* what the main idea is

* what you did (precise description)

* the most surprising/interesting things you found

* limitations

* future directions

look at other subsets of WS1S, whole of WS1S

START

$N = \{0,1,2,\ldots\}$

Recall the syntax for WS1S:

$F ::= v \subseteq v \mid s(v,v) \mid F \lor F \mid \lnot F \mid \exists v.F$

The language $L^R$ is the language of WS1S with the restriction that there are no free second-order (set) variables. A w-relation is a relation on first-order variables expressible in $L^R$ .

The W-Problem

Characterize the set of w-relations expressible in $L^R$ .

We want to characterize the set by proving that a w-relation is recognized by an automaton (with parallel input) that accepts strings of the form ABC, where the set of possible lengths of B is a union of finitely many sets S(a,b) for constants a and b.

Here, $S(a,b) = \{ a + k b, k \in \mathbb{N} \}$ .

Terminology clarification

Define a set $S$ to be ultimately periodic iff its characteristic function $x \mapsto \mbox{if}\ x\in S\ \mbox{then} \ 1\ \mbox{else}\ 0$ is ultimately periodic.

A sequence $s : N \to \{0,1\}$ is ultimately periodic iff there exist $n_0 \in N$ and $v \in N$ with $v \geq 1$ such that $\forall n \geq n_0.\ s_n = s_{n+v}$ .

Let $S(a,b) = \{ a + k b, k \in \mathbb{N} \}$ .

Observe that a set is ultimately periodic iff it is a union of a finite number of sets of the form $S(a,b)$ .

proof

Consider an ultimately periodic set $S$ with some $v$ and $n_0$ . Consider the set $I \subseteq \{0,\ldots,n_0-1\}$ of numbers $i < n_0$ belonging to $S$ , and the set $J \subseteq \{n_0,\ldots,n_0+v-1\}$ of numbers $j$ between $n_0$ and $n_0+v-1$ belonging to $S$ . Then $S=\bigcup_{i \in I}\{i+0k\ |\ k\in \mathbb{N}\} \cup \bigcup_{j \in J}\{j+vk\ |\ k\in \mathbb{N}\}.$

We prove the other side of the equivalence by induction.

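Base Case: Clearly $S(a,b)$ is ultimately periodic ( $n_0 = a$ , $v=b$ if $b \geq 1$ ; $n_0 = a+1$ , $v=1$ if $b = 0$ ).
Induction: The union of an ultimately periodic set ( $n_{0old}$ , $v_{old}$ ) and a set of the form $S(a,b)$ with $b \geq 1$ is an ultimately periodic set with $n_0 = \max(n_{0old},a)$ and $v=b\times v_{old}.$ Indeed, beyond $n_0$ membership in each of the two sets is unchanged by adding any multiple of that set's own period, hence by adding the common multiple $b\times v_{old}$ . If $b = 0$ , the set $S(a,0)=\{a\}$ only adds one element, and we can take $n_0 = \max(n_{0old},a+1)$ and $v=v_{old}.$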
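
The first direction of the proof can be checked on small examples. The following is a minimal Python sketch, not part of the original notes: the helper names `linear_set`, `decompose`, `union_up_to` and the example set are hypothetical, and the threshold `n0` and period `v` are assumed to be given and correct for the set described by `is_member`.

```python
def linear_set(a, b, bound):
    """Elements of S(a, b) = {a + k*b : k in N} that are <= bound."""
    if b == 0:
        return {a} if a <= bound else set()
    return set(range(a, bound + 1, b))


def decompose(is_member, n0, v):
    """Pairs (a, b) whose sets S(a, b) cover an ultimately periodic set.

    Assumption: is_member(n) == is_member(n + v) for all n >= n0, with v >= 1.
    """
    prefix = [(i, 0) for i in range(n0) if is_member(i)]            # the set I
    periodic = [(j, v) for j in range(n0, n0 + v) if is_member(j)]  # the set J
    return prefix + periodic


def union_up_to(pairs, bound):
    """Union of the sets S(a, b) for the given pairs, cut off at bound."""
    result = set()
    for a, b in pairs:
        result |= linear_set(a, b, bound)
    return result


# Hypothetical example: {1, 4} together with all n >= 5 such that n % 3 == 2.
def example(n):
    return n in (1, 4) or (n >= 5 and n % 3 == 2)


pairs = decompose(example, n0=5, v=3)
print(pairs)  # [(1, 0), (4, 0), (5, 3)]
```

Comparing `union_up_to(pairs, bound)` with the elements of the example set below `bound` only tests the decomposition on a finite prefix; it does not replace the proof.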

Little Problem

Let L be a regular language over an alphabet $\Sigma$ and

$M = \{ |w|, w \in L \}$

be the set of lengths of words in L. Prove that M is a union of a finite number of sets S(a,b) for some finite number of tuples (a,b). Here, $S(a,b) = \{ a + k b, k \in \mathbb{N} \}$ .

Proof

We prove the claim by induction on the structure of the regular expression representing the language.
Notation: We say that a set has property F if it is a union of finitely many sets S(a,b) for some tuples (a,b). We say a language L (or the corresponding regular expression) has property P if its set of lengths M has property F.

Hence, we need to prove that any regular language has property P.

Recall that regular expressions over an alphabet $\Sigma$ are all strings over the alphabet $\Sigma \cup \{(,),\phi, \cup, *\}$ that can be obtained as follows:

(1) $\phi$ and each member of $\Sigma$ is a regular expression.
(2) If $\alpha$ and $\beta$ are regular expressions, then so is their concatenation: $(\alpha\beta)$ .
(3) If $\alpha$ and $\beta$ are regular expressions, then so is their union: $(\alpha \cup \beta)$ .
(4) If $\alpha$ is a regular expression, then so is its Kleene closure: $\alpha^*$ .
(5) Nothing is a regular expression unless it follows from (1) through (4).

The regular language $L(\alpha)$ denoted by a regular expression $\alpha$ is defined as follows:

(1) $L(\phi) = \phi$ and $L(a) = \{a\}$ for each $a \in \Sigma$ .
(2) If $\alpha$ and $\beta$ are regular expressions, then $L((\alpha\beta)) = L(\alpha)L(\beta)$ .
(3) If $\alpha$ and $\beta$ are regular expressions, then $L((\alpha \cup \beta)) = L(\alpha) \cup L(\beta)$ .
(4) If $\alpha$ is a regular expression, then $L(\alpha^*) = L(\alpha)^*$ .

Base Case:
The length of a single character in $\Sigma$ is 1. Thus the set of lengths of a regular expression consisting of a single character is $\{1+0\cdot k, k \in \mathbb{N}\}$ . The set of lengths of $\phi$ is empty, that is, the union of zero sets S(a,b). Hence any such regular expression has property P.

To prove the inductive step, we need the following lemma:

Lemma 1 The set $S=\{a+bk_1+ck_2\}$ where $k_1, k_2 \in N$ satisfies property F, that is, $S$ is equal to a union of finitely many sets of the form $\{\alpha + \beta k\}$ , $k \in N$ .

Proof of Lemma 1

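If $b = 0$ or $c = 0$ , then $S$ is itself of the form $\{a + ck\}$ or $\{a + bk\}$ and there is nothing to prove. So assume $b,c \geq 1$ and consider the following two (mutually disjoint and exhaustive) cases:

(1) gcd(b,c)=1
(2) gcd(b,c)=d, $d\neq 1$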

Case (1): gcd(b,c)=1 Consider the numbers contained in the set $\{bk_1 + ck_2\}.$ Any such number is congruent to $ck_2 \pmod{b}.$ Letting $k_2$ range over $\{0,\ldots,b-1\}$ , we obtain for each value of $k_2$ a distinct congruence class $\pmod{b}$ : $[0], [c], [2c],\ldots,[(b-1)c].$

The congruence classes are distinct by the following argument: assume there exist $j$ and $l$ such that $0\leq j,l<b$ and $jc \equiv lc \pmod{b}.$ Then $jc+m b = lc + n b$ for some $m,n.$ Hence $(l-j)c = (m-n)b.$ Since b divides $(m-n)b,$ it must divide $(l-j)c.$ But gcd(b,c)=1 hence b must divide $l-j$ . But $|l-j|<b$ , hence we must have $l-j=0.$

Since there are exactly b congruence classes $\pmod{b},$ every class is reached, and the set $\{bk_1 + ck_2\}$ contains all natural numbers from $n_0 = (b-1)(c-1)$ onwards. This is because of the following theorem from elementary number theory: if gcd(b,c)=1 and $n \geq (b-1)(c-1)$ , then $bx+cy=n$ has a non-negative solution, that is, one in which both x and y are non-negative integers. There are also finitely many numbers below $n_0$ that are contained in the set $\{bk_1 + ck_2\}$ (for example, $0$ is in the set).

Hence, the set $\{a + bk_1 + ck_2\}$ is equal to the union of a finite number of constants (less than $a + n_0$ ) and the set $\{a + n_0 + k, k\in \mathbb{N}\}$ . Each constant $t$ can be represented as the set $\{t + 0\cdot k, k\in \mathbb{N}\}$ . In other words, if gcd(b,c) = 1, then the set $S=\{a+bk_1+ck_2\}$ where $k_1, k_2 \in N$ satisfies property F.

Case (2): gcd(b,c)=d, $d\neq 1$ Any number of the form $bk_1 + ck_2$ can be written as $d(mk_1 + nk_2)$ where $b = md$ and $c = nd$ and $\gcd(m,n) = 1$ . From (1), we know that the set $T = \{mk_1 + nk_2\}$ satisfies property F. The set $\{d(mk_1 + nk_2)\}$ can be obtained from $T$ by multiplying both constants $\alpha$ and $\beta$ of each set $\{\alpha + \beta k\}$ in the union for $T$ by $d$ , and so satisfies property F. Adding $a$ to each constant $\alpha$ then gives $S$ .

Hence, if gcd(b,c) = d , then the set $S=\{a+bk_1+ck_2\}$ where $k_1, k_2 \in N$ satisfies property F.

$\Box$
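
The construction in the proof of Lemma 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the original notes: the function names `lemma1_pairs`, `expand` and `brute_force` are hypothetical, and $b,c \geq 1$ is assumed. The sketch relies on the theorem quoted in Case (1), applied to $m = b/d$ and $n = c/d$ .

```python
from math import gcd


def lemma1_pairs(a, b, c):
    """Pairs (alpha, beta) whose sets {alpha + beta*k} cover {a + b*k1 + c*k2}.

    Assumption: b >= 1 and c >= 1.
    """
    d = gcd(b, c)
    m, n = b // d, c // d       # gcd(m, n) == 1
    n0 = (m - 1) * (n - 1)      # every t >= n0 has the form m*k1 + n*k2
    # Representable numbers below n0, found by trying every k1 with m*k1 <= t.
    small = [t for t in range(n0)
             if any((t - m * k1) % n == 0 for k1 in range(t // m + 1))]
    pairs = [(a + d * t, 0) for t in small]  # finitely many constants
    pairs.append((a + d * n0, d))            # the periodic tail
    return pairs


def expand(pairs, bound):
    """Union of the sets {alpha + beta*k}, cut off at bound."""
    out = set()
    for alpha, beta in pairs:
        if beta == 0:
            if alpha <= bound:
                out.add(alpha)
        else:
            out.update(range(alpha, bound + 1, beta))
    return out


def brute_force(a, b, c, bound):
    """All numbers a + b*k1 + c*k2 <= bound, by direct enumeration."""
    return {a + b * k1 + c * k2
            for k1 in range(bound // b + 1)
            for k2 in range(bound // c + 1)
            if a + b * k1 + c * k2 <= bound}


print(lemma1_pairs(2, 6, 10))  # [(2, 0), (8, 0), (12, 0), (14, 0), (18, 2)]
```

For $a=2$ , $b=6$ , $c=10$ we have $d=2$ , $m=3$ , $n=5$ and $n_0=8$ , so the set consists of the constants 2, 8, 12, 14 and the tail $\{18 + 2k\}$ . Comparing `expand` with `brute_force` up to a bound is only a finite sanity check.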

Inductive step:
We prove that the operations union, concatenation and Kleene closure, applied to regular expressions satisfying property P, yield a regular expression satisfying property P.

Union: Let $r_1$ and $r_2$ be two regular expressions satisfying property P. Then, their corresponding sets of lengths $M_1$ and $M_2$ satisfy property F. The set of lengths $M$ of $r=r_1 \cup r_2$ is simply $M=M_1 \cup M_2$ . Hence $r$ satisfies property P.

Concatenation: Let $r_1$ and $r_2$ be two regular expressions satisfying property P. Then, their corresponding sets of lengths $M_1$ and $M_2$ satisfy property F. The set of lengths of $r=r_1r_2$ is the set $Q = \{t_1 + t_2, t_1 \in M_1, t_2 \in M_2\}.$ Taking $t_1$ from a component $S(a_1,b)$ of $M_1$ and $t_2$ from a component $S(a_2,c)$ of $M_2$ gives elements of the form $a_1 + bk_1 + a_2 + ck_2= a + bk_1 + ck_2$ with $a = a_1 + a_2.$ By Lemma 1, each set $\{a+bk_1+ck_2\}$ satisfies property F. Set Q is the union of these finitely many sets (one for each pair of components) and therefore also satisfies property F. Thus, $r$ satisfies property P.

Kleene Closure: Let $r$ be a regular expression satisfying property P. Let the corresponding set of lengths be $M$ that satisfies property F, so that $M = \bigcup_{i=1}^{n} m_i$ where each $m_i$ is a set of the form $S(a_i,b_i)$ and $n$ is finite. A word of $r^*$ is a concatenation of any number of words of $r$ (including zero). Hence, the set $M^*$ of possible lengths of words in $r^*$ consists of $0$ together with all finite sums of elements of $M$ .

Consider such a sum, and let $T \subseteq \{1,\ldots,n\}$ be the non-empty set of indices $i$ for which at least one summand is taken from $m_i$ . If $j_i \geq 1$ summands are taken from $m_i$ , their total is $a_ij_i + b_ik_i$ for some $k_i \in \mathbb{N}$ . Conversely, every number of this form with $j_i \geq 1$ arises in this way (take $a_i + b_ik_i$ once and $a_i$ the remaining $j_i - 1$ times). Writing $j_i = 1 + j'_i$ , the sums that use exactly the indices in $T$ form the set $M_T = \{\sum_{i \in T} (a_i + a_ij'_i + b_ik_i),\ j'_i, k_i \in \mathbb{N}\}$ .

(Example: Consider the regular expression: $r = (aaa(aaaaa)^*)$ . The set of lengths of words corresponding to its language is $\{3 + 5k, k \in \mathbb{N}\}$ . The corresponding set for $r^*$ is $\{0\} \cup \{3 + 3k_1 + 5k_2, k_1, k_2 \in \mathbb{N}\}$ .)

So, $M^* = \{0\} \cup \bigcup_{T} M_T$ , a finite union over the non-empty subsets $T$ of $\{1,\ldots,n\}$ . Each $M_T$ is the sum of finitely many sets of the form S(a,b): the set $S(\sum_{i \in T} a_i, 0)$ together with $S(0,a_i)$ and $S(0,b_i)$ for each $i \in T$ . By the argument in the case for concatenation, the sum of two sets satisfying property F satisfies property F. We can therefore combine the terms one at a time, and the resulting set $M_T$ also satisfies property F. Hence, $M^*$ satisfies property F.

Since any regular expression is constructed from the base cases only by the application of the above three steps (union, concatenation and Kleene closure), the language corresponding to any regular expression satisfies property P.

$\Box$
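
The statement of the Little Problem can also be observed directly on an automaton. The sketch below does not follow the induction on regular expressions used above; it swaps in a different, automaton-based technique: it erases the letters of an NFA without epsilon-transitions and follows the set of states reachable after $n$ steps, which must eventually repeat because there are only finitely many such sets. The function name `length_pairs` and the encoding of the automaton as a dictionary are assumptions made for this illustration.

```python
def length_pairs(delta, start, accepting):
    """Pairs (a, b) such that the accepted lengths are the union of S(a, b).

    delta maps a state to the set of its successors (letters erased);
    accepting is a set of states. Epsilon-transitions are not supported.
    """
    seen = {}    # set of reachable states -> first length at which it occurs
    flags = []   # flags[n]: is some word of length n accepted?
    current = frozenset([start])
    while current not in seen:
        seen[current] = len(flags)
        flags.append(bool(current & accepting))
        current = frozenset(t for s in current for t in delta.get(s, ()))
    n0 = seen[current]        # threshold: start of the repeating part
    v = len(flags) - n0       # period of the repeating part
    return ([(n, 0) for n in range(n0) if flags[n]]
            + [(n, v) for n in range(n0, n0 + v) if flags[n]])


# Hypothetical automaton for aaa(aaaaa)*: a path 0-1-2-3 and a 5-cycle through 3.
delta = {0: {1}, 1: {2}, 2: {3}, 3: {4}, 4: {5}, 5: {6}, 6: {7}, 7: {3}}
print(length_pairs(delta, 0, {3}))  # [(3, 5)]
```

The threshold and period found this way are those of the sequence of reachable state sets, so they need not be the smallest possible ones for the set of lengths.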

The unary case for the W-problem

We consider the special case of a unary relation on natural numbers corresponding to a formula $F(x)$ with only one free variable $x$ . This formula defines a set $S=\{x\ |\ F(x)\}.$
Claim: Set S is ultimately periodic.
Proof: Given formula $F(x)$ , there exists an automaton $A$ with input alphabet $\{0,1\}$ which accepts the strings $0^{x-1}10^*$ iff $F(x)=\mbox{true}.$
We define the automaton $A^*$ on $\{0,1\}$ which accepts the string $0^{x-1}1$ iff $F(x)=\mbox{true}.$ Note that $x=|0^{x-1}1|$ .
Note that the language corresponding to $0^{x-1}1,\ F(x)=\mbox{true}$ is regular iff the language corresponding to $0^{x-1}10^*,\ F(x)=\mbox{true}$ is regular. Indeed, if $0^{x-1}1$ is regular then $0^{x-1}10^*$ is the concatenation of the regular expressions $0^{x-1}1$ and $0^*$ and hence is regular. Conversely, if $0^{x-1}10^*$ is regular, then the language corresponding to $0^{x-1}1$ is the intersection of the languages corresponding to the regular expressions $0^{x-1}10^*$ and $0^*1$ , and hence is regular.
Furthermore, $A^*$ accepts string $0^{x-1}1$ iff $A$ accepts strings $0^{x-1}10^*$ .
Consider the language $L$ of $A^*.$ By the Little Problem, the set of lengths $M_L$ of words in $L$ is a finite union of linear sets $S(a_i,b_i).$ Every word accepted by $A^*$ is of the form $0^{x-1}1$ with $F(x)=\mbox{true}$ , and $x=|0^{x-1}1|$ , so every length in $M_L$ is an element of $S$ . Conversely, for every $x \in S$ the word $0^{x-1}1$ is accepted by $A^*$ , so $x \in M_L$ . Hence $\bigcup_{i}S(a_i,b_i)=M_L=S$ .
S is thus a finite union of linear sets $S(a_i,b_i).$ Hence, by the observation in the terminology clarification, S is ultimately periodic.
$\Box$

The binary case for the W-problem

Consider a binary relation $R(x,y)$ in $L^R$ and the corresponding formula $F(x,y)$ with two free variables $x$ and $y$ . $F(x,y)$ defines a set $Q=\{(x,y)|\ F(x,y)\}$ . Elements of the set $Q$ are recognized by an automaton $A$ with parallel inputs and input alphabet $\Sigma = \{\binom{0}{0},\binom{0}{1},\binom{1}{0},\binom{1}{1}\}$ . The language $L$ of $A$ is a subset of the regular language corresponding to $\binom{0}{0}^*\binom{1}{0}\binom{0}{0}^*\binom{0}{1}\binom{0}{0}^* + \binom{0}{0}^*\binom{0}{1}\binom{0}{0}^*\binom{1}{0}\binom{0}{0}^* + \binom{0}{0}^*\binom{1}{1}\binom{0}{0}^*$ . Thus the input string to the automaton is of the form $\binom{0}{0}^{k_1}\binom{1}{0}\binom{0}{0}^{k_2}\binom{0}{1}\binom{0}{0}^{k_3}$ or $\binom{0}{0}^{k_4}\binom{0}{1}\binom{0}{0}^{k_5}\binom{1}{0}\binom{0}{0}^{k_6}$ or $\binom{0}{0}^{k_7}\binom{1}{1}\binom{0}{0}^{k_8}$ .
Claim: A pair $(x,y)$ belongs to some set $Q=\{(x,y)|\ F(x,y)\}$ for some $F$ iff the sets $\{k_i\}_i$ of possible lengths of the exponents $k_i$ are ultimately periodic.
Proof

Only if: Let $(x,y)$ belong to $Q=\{(x,y)|\ F(x,y)\}$ . There exists an automaton $A$ with parallel inputs which accepts the corresponding input string.

W.l.o.g., assume $x \leq y.$