LARA – Lab for Automated Reasoning and Analysis -

This is an old revision of the document!

Introduction: We know that Weak Monadic Second-order theory of 1 Successor (WS1S) is decidable through automata, that is, the set of satisfying interpretations of a subformula is represented by a finite-state automaton [Buc60b, Elg61]. The decision procedures (using the automata technique) have been implemented efficiently (for WS1S with certain restrictions) in MONA. The main problem with this approach is exponential blow-up in the states of the automaton due to nested quantifiers [Kla01]. We were interested in finding another way of deciding WS1S without using the automata. We began by investigating whether we can characterize relations in WS1S in a way that helps us eliminate quantifiers. Since we couldn't get any substantial results, we began looking at a subset of WS1S. This paper describes the key results that we obtained.

Conclusions and Future Directions: In the course of our mini-project we proved an interesting result about the lengths of words of a regular language. We used this result to characterize relations in the language $L^R$ . We hope that this is a step in the direction of our original aim: characterizing relations in WS1S.

START

$N = \{0,1,2,\ldots\}$

Recall the syntax for WS1S:

$\begin{equation*} F ::= v \subseteq v \mid s(v,v) \mid F \lor F \mid \lnot F \mid \exists v.F \end{equation*}$

The language $L^R$ is the language of WS1S with the restriction that there are no free second-order (set) variables. A w-relation is a relation on first-order variables expressible in $L^R$ .

The W-Problem

Characterize the set of w-relations expressible in $L^R$ .

We want to characterize the set by proving that a w-relation is recognized by an automaton (with parallel input) that accepts strings of the form ABC where the length of B is completely described by a union of a finite number of sets S(a,b) for some finite number of tuples (a,b) for given constants a and b.

Here, $S(a,b) = \{ a + k b, k \in \mathbb{N} \}$ .

Terminology clarification

Define a set $S$ to be ultimately periodic iff its characteristic function $x \mapsto \mbox{if}\ x\in S\ \mbox{then} \ 1\ \mbox{else}\ 0$ is ultimately periodic.

A sequence $s : N \to \{0,1\}$ is ultimately periodic iff there exists $n_0 \in N$ and $v \in N$ such that $\forall n \geq n_0.\ s_n = s_{n+v}$ .

Let $S(a,b) = \{ a + k b, k \in \mathbb{N} \}$ .

Observe that a set is ultimately periodic iff it is a union of a finite number of sets of the form $S(a,b)$ .

proof

Consider an ultimately periodic set $S$ with some $v$ and $n_0$ . Consider the set $I \subseteq \{1,\ldots,n_0-1\}$ of numbers $i < n_0$ belonging to $S$ , and the set $J \subseteq \{n_0,\ldots,n_0+v-1\}$ of numbers $j$ between $n_0$ and $n_0+v-1$ belonging to $S$ . Then $S=\bigcup_{i \in I}\{i+0k\ |\ k\in \mathbb{N}\} \cup \bigcup_{j \in J}\{j+vk\ |\ k\in \mathbb{N}\}.$

We prove the other side of the equivalence by induction.

Base Case: Clearly $S(a,b)$ is ultimately periodic ( $n_0 = a$ , $v=b$ ).
Induction: The union of an ultimately periodic set ( $n_{0old}$ , $v_{old}$ ) and a set of the form $S(a,b)$ is an ultimately periodic set with $n_0 = \max(n_{0old},a)$ and $v=b\times v_{old}.$ TO EXPLAIN WITH MORE DETAIL IF NOT OBVIOUS

Little Problem

Let L be a regular language over an alphabet $\Sigma$ and

$M = \{ |w|, w \in L \}$

be the set of lengths of words in L. Prove that M is a union of a finite number of sets S(a,b) for some finite number of tuples (a,b). Here, $S(a,b) = \{ a + k b, k \in \mathbb{N} \}$ .

Proof

We prove the claim by induction on the structure of the regular expression representing the language.
Notation: We say that a set has property F if it is a union of finitely many sets S(a,b) for some tuples (a,b). We say a language L (or the corresponding regular expression) has property P if its set of lengths M has property F.

Hence, we need to prove that any regular language has property P.

Recall that Regular Expressions over an alphabet $\alpha^*$ are all strings over the alphabet $\{\Sigma \cup \{(,),\phi, \cup, *\}\}$ that can be obtained as follows:

(1) $\phi$ and each member of $\alpha$ is a regular expression.
(2) If $\alpha$ and $\beta$ are regular expressions, then so is their concatenation: $(\alpha\beta)$ .
(3) If $\alpha$ and $\beta$ are regular expressions, then so is their union: $(\alpha \cup \beta)$ .
(4) If $\alpha$ is a regular expression, then so is its Kleene closure: $\alpha*$ .
(5) Nothing is a regular expression unless it follows from (1) through (4).

A Regular Language, $L$ is generated as follows:

(1) $L(\phi) = \phi$ and $L(a) = a$ for each $a \in \Sigma$ .
(2) If $\alpha$ and $\beta$ are regular expressions, then $L((\alpha\beta)) = L(\alpha)L(\beta)$ .
(3) If $\alpha$ and $\beta$ are regular expressions, then $L((\alpha \cup \beta)) = L(\alpha) \cup (\beta)$ .
(4) If $\alpha$ is a regular expression, then $L((\alpha)*) = L(\alpha)*$ .

Base Case:
The length of a single character in $\Sigma$ is 1. Thus the length of any word corresponding to a single character belongs to the set $\{1+0.k, k \in \mathbb{N}\}$ . Hence any such word has property P.

To prove the inductive step, we need the following lemma:

Lemma 1 The set $S=\{a+bk_1+ck_2\}$ where $k_1, k_2 \in N$ satisfies property F, that is, $S$ is equal to a union of finitely many sets of the form $\{\alpha + \beta k\}$ , $k \in N$ .

Proof of Lemma 1

Consider the following two (mutually disjoint and exhaustive) cases:

For $b,c,d \in \mathbb{N}$ ,

(1) gcd(b,c)=1
(2) gcd(b,c)=d, $d\neq 1$ ,

Case (1): gcd(b,c)=1 Consider the numbers contained in the set $\{bk_1 + ck_2\}.$ Any such number is congruent to $ck_2 \pmod{b}.$ Letting $k_2$ range over $0,\ldots,b-1\}$ , we obtain for each value of $k_2$ a distinct congruence class $\pmod{b}$ : $[0], [c], 2c],\ldots,[(b-1)c].$

The congruence classes are distinct by the following argument: assume there exist $j$ and $l$ such that $0\leq j,l<b$ and $jc=lc \pmod{b}.$ Then $jc+m b = lc + n b$ for some $m,n.$ Hence $(l-j)c = (m-n)b.$ Since b divides $(n-m)b,$ it must divide $(l-j)c.$ But gcd(b,c)=1 hence b must divide $l-j$ . But $l-j<b$ , hence we must have $l-j=0.$

Since there are b distinct congruence classes $\pmod{b},$ the set $\{bk_1 + ck_2\}$ contains all natural numbers beyond $n_0 = (b-1)(c-1)$ . This is because of the following theorem from Elementary Number Theory: If gcd(b,c)=1 and n >= (b-1)(c-1). Then bx+cy=n has a non-negative solution, that is, one in which both x and y are non-negative integers. There are also finitely many numbers below $n_0$ that are contained in the set $\{bk_1 + ck_2\}$ (For example, $0$ is in the set).

Hence, the set $\{a + bk_1 + ck_2\}$ is equal to the union of a finite number of constants (less than $a + n_0$ ) and the set $\{a + n_0 + k\}$ . Each constant $c$ can be represented in the set $\{c + 0.k, k\in \mathbb{N}\}$ . In other words, if gcd(b,c) = 1, then the set $S=\{a+bk_1+ck_2\}$ where $k_1, k_2 \in N$ satisfies property F.

Case(2): gcd(b,c)=d, $d\neq 1$ Any number of the form $bk_1 + ck_2$ can be written as $d(mk_1 + nk_2)$ where $b = md$ and $c = nd$ and $gcd(m,n) = 1$ . From (1), we know that the set $T = (mk_1 + nk_2)$ satisfies property F. The set $d(mk_1 + nk_2)$ can be obtained from $T$ by multiplying the constant factors (a and b) in each set of $T$ by $d$ and so, satisfies property F.

Hence, if gcd(b,c) = d , then the set $S=\{a+bk_1+ck_2\}$ where $k_1, k_2 \in N$ satisfies property F.

$\Box$

Inductive step:
We prove that the operations union, concatenation and Kleene closure of two regular expressions satisfying property P yield a regular expression satisfying property P.

Union: Let $r_1$ and $r_2$ be two regular expressions satisfying property P. Then, their corresponding sets of lengths $M_1$ and $M_2$ satisfy property F. The set of lengths $M$ of $r=r_1 \cup r_2$ is simply $M=M_1 \cup M_2$ . Hence $r$ satisfies property P.

Concatenation: Let $r_1$ and $r_2$ be two regular expressions satisfying property P. Then, their corresponding sets of lengths $M_1$ and $M_2$ satisfy property F. The set of lengths of $r=r_1r_2$ is the finite set $Q = \{t_1 + t_2, t_1 \in M_1 \mbox, t_2 \in M_2\}.$ This set contains elements of the form $a_1 + bk_1 + a_2 + ck_2= a + bk_1 + ck_2.$ By Lemma 1, $\{a+bk_1+ck_2\}$ satisfies property F. Set Q therefore, also satisfies property F. Thus, $r$ satisfies property P.

Kleene Closure: Let $r$ be a regular expression satisfying property P. Let the corresponding set of lengths be $M$ that satisfies property F. $M = \bigcup_{n} m_i$ where $m_i$ is a set of the form S(a,b) and $n$ is finite. The regular expression $r*$ can contain any number of repetitions (including zero) of any of the sub-regular expressions. Hence, the possible lengths of words in $r*$ is given by $M* = \Sigma_{n} m_ik_i$ where $k_i \in \mathbb{N}$ . Each term of this summation is of the form $(a_i + kb_i)k_i$ where $k, k_i \in \mathhbb{N}$ .

Consider one such term: $(a + kb)k_1$ . For different values of $k$ , we get terms of the form: $ak_1, (a + b)k_1, (a + 2b)k_1, (a + 3b)k_1,...$ Any number of the form $(a + nb)k$ can also be obtained from $ak_1 + (a +b)k_2 = a(k_1 + k_2) + bk_2$ by choosing the constants appropriately: $k = k_1 + k_2, nk = k_2$ Thus, the term $(a + kb)k_1$ can be written as $ak_1 + (a +b)k_2$

(Example: Consider the regular expression: $r = (aaa(aaaaa)*)$ . The set of lengths of words corresponding to its language is ${3 + 5k, k \in \mathbb{N}}$ . The corresponding set for $r*$ is ${3k_1 + 8k_2, k_1, k_2 \in \mathbb{N}}$ .)

So, $M* = \Sigma_{n} a_ik_i + c_ik_j$ where $c_i = a_i + b_i$ . From lemma 1, we know that the sum of any two terms in this summation is a set that satisfies property F. Thus, $M*$ is the sum of finitely many sets: each set satisfying property F. We can repeatedly combine the terms of all these sets and by the argument in the case for concatenation, we know that the resulting set also satisfies property F. Hence, $M*$ satisfies property F.

Since, any regular expression can be constructed only by the application of the above three steps: Union, Concatenation and Kleene Closure, the language corresponding to any regular expression satisfies property P.

$\Box$

The unary case for the W-problem We consider the special case of a unary relation on natural numbers corresponding to a formula $F(x)$ with only one free variable $x$ . This formula defines a set $S=\{x\ |\ F(x)\}.$
Claim: Set S is ultimately periodic.
Proof: Given formula $F(x)$ , there exists and automaton A with input alphabet $\{0,1\}$ which accepts the string $0^{x-1}10^*$ iff $F(x)=\mbox{true}.$
We define the automaton $A^*$ on $\{0,1\}$ which accepts the string $0^{x-1}1$ iff $F(x)=\mbox{true}.$ Note that $x=|0^{x-1}1|$ .
Note that the language corresponding to $0^{x-1}1,\ F(x)=\mbox{true}$ is regular iff the language corresponding to $0^{x-1}10^*,\ F(x)=\mbox{true}$ is regular. Indeed, if $0^{x-1}1$ is regular then $0^{x-1}10^*$ is the concatenation of the regular expressions $0^{x-1}1$ and $0^*$ and hence is regular. Conversely, if $0^{x-1}10^*$ is regular, then the language corresponding to $0^{x-1}1$ is the intersection of the languages corresponding to the regular expressions $0^{x-1}10^*$ and $0^*1$ , and hence is regular.
Furthermore, $A^*$ accepts string $0^{x-1}1$ iff $A$ accepts strings $0^{x-1}10^*$ .
Consider the language $L$ of $A.$ By the Little Problem, the set of lengths $M_L$ of words in $L$ is a finite union of linear sets $S_i(a,b).$ For all $i$ , the set $S_i(a,b)$ is the set of lengths of a subset of the words accepted by $A$ , all of the form $0^{x-1}1$ . Since $x=|0^{x-1}1|$ , $S_i(a,b)$ is a subset of the integers $x$ such that $0^{x-1}1$ is accepted by A, i.e. a subset of S. Furthermore, the finite disjoint union $\bigcup_{i}S_i(a,b)=M_L$ contains exactly all lengths of words accepted by $A$ , hence $\bigcup_{i}S_i(a,b)$ contains exactly all integers $x$ such that $0^{x-1}1$ is accepted by A, i.e. $\bigcup_{i}S_i(a,b)=S$ .
S is thus a finite union of linear sets $S_i(a,b).$ Hence by definition, S is ultimately periodic.
$\Box$

The binary case for the W-problem Consider a binary relation $R(x,y)$ in $L_R$ and the corresponding formula $F(x,y)$ with two free variables $x$ and $y$ . $F(x,y)$ defines a set $Q=\{(x,y)|\ F(x,y)\}$ . Elements of the set $Q$ are recognized by an automaton $A$ with parallel inputs and input alphabet $\Sigma = \{\binom{0}{0},\binom{0}{1},\binom{1}{0},\binom{1}{1}\}$ . The language L of A is a subset of the regular language corresponding to $\binom{0}{0}^*\binom{1}{0}\binom{0}{0}^*\binom{0}{1}\binom{0}{0}^* + \binom{0}{0}^*\binom{0}{1}\binom{0}{0}^*\binom{1}{0}\binom{0}{0}^* + \binom{0}{0}^*\binom{1}{1}\binom{0}{0}^*$ . Thus the input string to the automaton is of the form $\binom{0}{0}^{k_1}\binom{1}{0}\binom{0}{0}^{k_2}\binom{0}{1}\binom{0}{0}^{k_3}$ or $\binom{0}{0}^{k_4}\binom{0}{1}\binom{0}{0}^{k_5}\binom{1}{0}\binom{0}{0}^{k_6}$ or $\binom{0}{0}^{k_7}\binom{1}{1}\binom{0}{0}^{k_8}$ .
Claim A pair $(x,y)$ belongs to some set $Q=\{(x,y)|\F(x,y)\}$ for some $F$ iff the sets $\{k_i\}_i$ of possible lengths of the exponents $k_i$ are ultimately periodic.
Proof

Only if: Let $(x,y)$ belong to $Q=\{(x,y)|\F(x,y)\}$ . There exists an automaton A with parallel inputs which accepts the corresponding input string.

W.l.o.g, assume $x \leq y.$ Then the input string can be of one of the two forms:
(a) $\binom{0}{0}^{k_1}\binom{1}{0}\binom{0}{0}^{k_2}\binom{0}{1}\binom{0}{0}^{k_3}$
(b) $\binom{0}{0}^{k_7}\binom{1}{1}\binom{0}{0}^{k_8}$
case (a): We want to show that the sets $\{k_1\}$ , $\{k_2\}$ and $\{k_3\}$ are ultimately periodic. Due to the little problem, it is sufficient to show that the language generated by the unary alphabet $\binom{0}{0}$ : $\binom{0}{0}^{k_i}, i = 1,2,3$ is regular. We show that $\binom{0}{0}^{k_2}$ is regular by constructing an accepting automaton, $A = (\Sigma, \delta, s, f, S)$ . Let $M = (\Sigma^{'}, \delta^{'}, s^{'}, f^{'}, S^{'})$ be the automaton that accepts the original input string (a). Let $q$ be the accepting run in $M$ . In $q$ , there exist unique states $m$ and $n$ such that $\delta^{'}(m, \binom{1}{0}) = n$ . Define $s = n$ . Similarly there exist unique $g, h$ such that $\delta^{'}(g, \binom{0}{1}) = h$ . Define $f = g$ . $S$ contains all the states of $S^{'}$ that are in $q$ between $s$ and $f$ . And, $\delta$ is the restriction of $\delta^{'}$ to the transitions in $q$ between $s$ and $f$ . $\Sigma = \{\binom{0}{0}\}$ . Thus the automaton is defined unambiguously.

We can similarly construct automata for the languages $\binom{0}{0}^{k_1}$ and $\binom{0}{0}^{k_3}$ . In the former case, the start state corresponds to $s^{'}$ and the final state is the state (in $q$ ) from which there is a transition on consuming $\binom{1}{0}$ . In the latter case, the final state corresponds to $f^{'}$ and the start state is that state in $q$ that is reached after consuming $\binom{0}{1}$ .

If: If $\{k_i\}$ is an ultimately periodic set. then the expression $p^k$ where $p$ belongs to a unary alphabet and $k \in \{k_i\}$ is a regular expression. This is because for each set $S(a,b)$ one can construct a regular expression (ppp…a times) concatenated with the kleene closure of (ppp…b times). Since $\{k_i\}$ is the finite union of sets of the form $S(a,b)$ , one can construct a regular expression which is the union of all the regular expressions formed as described above. Thus, $\binom{0}{0}^{k}$ is a regular expression for $k \in \{k_i\}$ . We now extend our alphabet to include $\binom{1}{0}$ , $\binom{0}{1}$ and $\binom{1}{1}$ . The above argument for $\binom{0}{0}^{k}$ still holds. (a), (b) and © are all regular expressions formed by concatenation. Thus, the input strings are accepted by finite automata. They are satisfying interpretations of some formula in $L^R$ because (a) can be written as $\exists k_1, k_2, x = k_1 \wedge y = x + k_2$ , and (b) is simply $x = y$ . © is symmetric to (a).

The above reasoning can be extended to the n-ary case. The input string has n non-zero elements and n+1 expressions of the form $(\{(0...0)^{-1}\})^k$ . A tuple $(x_1, x_2,...x_n)$ belongs to an n-ary relation iff the possible values of each exponent $k$ forms an ultimately periodic set.

References:
[Buc60a] Julius Richard Buchi. On a decision method in restricted second-order arithmetic. In Proc. Internat. Cong. on Logic, Methodol., and Philos. of Sci. Stanford University Press, 1960.
[Elg61] Calvin C. Elgot. Decision problems of Finite automata design and related arithmetics. Transactions of the American Mathematical Society, 98:21{52, 1961.
[Kla01] Nils Klarlund, Anders Moller. MONA Version 1.4 User Manual. BRICS, Department of Computer Science, University of Aarhus, 2001.