Differences
This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
minimization_of_state_machines [2008/09/11 18:21] vkuncak |
minimization_of_state_machines [2015/04/21 17:32] (current) |
||
---|---|---|---|
Line 4: | Line 4: | ||
**Goal:** build a state machine $M'$ with the least number of states that accepts the language $L(M)$. | **Goal:** build a state machine $M'$ with the least number of states that accepts the language $L(M)$. | ||
+ | * we obtain a space-efficient, executable representation of a regular language | ||
This is the process of //minimization// of $M$. | This is the process of //minimization// of $M$. | ||
+ | * an easy case of minimizing size of 'generated code' in compiler | ||
We say that state machine $M$ distinguishes strings $w$ and $w'$ iff it is not the case that ($w \in L(M)$ iff $w' \in L(M)$). | We say that state machine $M$ distinguishes strings $w$ and $w'$ iff it is not the case that ($w \in L(M)$ iff $w' \in L(M)$). | ||
+ | |||
+ | |||
+ | ==== Minimization Algorithm ==== | ||
=== Step 1: Remove unreachable states === | === Step 1: Remove unreachable states === | ||
Line 16: | Line 21: | ||
We wish to merge states $q$ and $q'$ into same group as long as they "behave the same" on all future strings $w$, i.e. | We wish to merge states $q$ and $q'$ into same group as long as they "behave the same" on all future strings $w$, i.e. | ||
- | \[ | + | \begin{equation*} |
- | \delta(q,w) \in F \mbox{ iff } \delta(q',w) \in F | + | \delta(q,w) \in F \mbox{ iff } \delta(q',w) \in F \ \ \ (*) |
- | \] | + | \end{equation*} |
for all $w$. | for all $w$. | ||
- | If the condition above holds, we called states **equivalent**. If the condition does **not** hold, we call states $q$,$q'$ **non-equivalent**. States $q$ and $q'$ are $w$-non-equivalent | + | If the condition $(*)$ above holds, we called states **equivalent**. If the condition does **not** hold, we call states $q$,$q'$ **non-equivalent**. |
- | if it is not the case that ($\delta(q,w) \in F \mbox{ iff } \delta(q',w) \in F$). | + | |
+ | States $q$ and $q'$ are $w$-non-equivalent if it is not the case that ($\delta(q,w) \in F \mbox{ iff } \delta(q',w) \in F$). | ||
+ | |||
+ | Two states are non-equivalent iff they are $w$-non-equivalent for some string $w$. | ||
Observe that | Observe that | ||
- | - if $q \in F$ and $q' \notin F$ then (taking $w$ to be empty string) we conclude that $q$ and $q'$ are non-equivalent. | + | - if $q \in F$ and $q' \notin F$ then $q$ and $q'$ are $\epsilon$-non-equivalent |
- | - if $q$ and $q'$ are non-equivalent and we have $\delta(r,a)=q$, $\delta(r',a)=q'$ for some symbol $a \in \Sigma$, then $r$ and $r'$ are non-equivalent. | + | - if $q$ and $q'$ are $w$-non-equivalent and we have $\delta(r,a)=q$, $\delta(r',a)=q'$ for some symbol $a \in \Sigma$, then $r$ and $r'$ are $aw$-non-equivalent |
+ | - conversely, if $r$ and $r'$ are $w'$-non-equivalent and $w$ is not an empty string, then for $w'=aw$ the states $\delta(r,a)$ and $\delta(r',a)$ are $w$-non-equivalent | ||
- | These two observations lead to an iterative algorithm for computing non-equivalence relation $\nu$ | + | These observations lead to an iterative algorithm for computing non-equivalence relation $\nu$ |
- | - initially put $\nu = (Q \cap F) \times (Q \setminus F)$ | + | - initially put $\nu = (Q \cap F) \times (Q \setminus F)$ (only final and non-final states are initially non-equivalent) |
- | - repeat until no more changes: if for some states $(r,r') \notin \nu$ there is $a \in \Sigma$ such that $(\delta(r,a),\delta(r',a)) \in \nu$, then $\nu := \nu \cup \{(r,r')\}$ | + | - repeat until no more changes: if $(r,r') \notin \nu$ and there is $a \in \Sigma$ such that $(\delta(r,a),\delta(r',a)) \in \nu$, then |
+ | \begin{equation*} | ||
+ | \nu := \nu \cup \{(r,r')\} | ||
+ | \end{equation*} | ||
- | ==== Step 3: Merge States that are not non-equivalent ==== | + | === Step 3: Merge States that are not non-equivalent === |
- | Relation $Q^2 \setminus \nu$ is an equivalence relation and the 'factor automaton' obtained by merging equivalent states is well defined. | + | Relation $Q^2 \setminus \nu$ is an equivalence relation $\sim$. We define the 'factor automaton' by merging equivalent states: |
+ | * the initial state is ${q_0}/_{\sim}$ | ||
+ | * $Q/_{\sim} = \{ \{ y \mid x \sim y \} \mid x \in Q \}$ | ||
+ | * $F/_{\sim} = \{ \{ y \mid x \sim y \} \mid x \in F \}$ | ||
+ | * relation $r = \{ ([x],[y]) \mid (x,y) \in \delta \}$ is a function, and we can use it to define a new deterministic automaton (there is a transition in the resulting automaton iff there is a transition between two states in the original automaton) | ||
This is the minimal automaton. | This is the minimal automaton. | ||
- | ==== Correctness of Minimization ==== | + | ==== Correctness of Constructed Automaton ==== |
- | Clearly, this algorithm terminates because in worst case all states become non-equivalent. We will prove below that the resulting value $\nu$ is the non-equivalence relation. | + | Clearly, this algorithm terminates because in worst case all states become non-equivalent. We will prove below that the resulting value $\nu$ is the non-equivalence relation, i.e. the complement of relation given by $(*)$ above. |
By induction, we can easily prove that if $(q,q') \in \nu$, then $q$ and $q'$ are non-equivalent. Similarly we can show that if $q$ and $q'$ are $w$-non-equivalent for $w$ of length $k$, then $(q,q') \in \nu$ by step $k$ of the algorithm. Because the algorithm terminates, this completes the proof that $\nu$ is the non-equivalence relation. | By induction, we can easily prove that if $(q,q') \in \nu$, then $q$ and $q'$ are non-equivalent. Similarly we can show that if $q$ and $q'$ are $w$-non-equivalent for $w$ of length $k$, then $(q,q') \in \nu$ by step $k$ of the algorithm. Because the algorithm terminates, this completes the proof that $\nu$ is the non-equivalence relation. | ||
Consequently, $Q^2 \setminus \nu$ is the equivalence relation. From the definition of this equivalence it follows that if two states are equivalent, then so is the result of applying $\delta$ to them. Therefore, we have obtained a well-defined deterministic automaton. | Consequently, $Q^2 \setminus \nu$ is the equivalence relation. From the definition of this equivalence it follows that if two states are equivalent, then so is the result of applying $\delta$ to them. Therefore, we have obtained a well-defined deterministic automaton. | ||
+ | |||
+ | |||
==== Minimality of Constructed Automaton ==== | ==== Minimality of Constructed Automaton ==== | ||
- | Note that if two states are non-equivalent, there is $w$ such that states $\delta(q_0,s_q w)$ and $\delta(q_0,s_{q'} w)$ have different acceptance, so $M$ distinguishes $s_q w$ and $s_{q'}w$. Now, if we take any other state machine $M' = (\Sigma,Q',\delta',q'_0,F')$ with $L(M')=L(M)$, it means that $\delta'(q'_0,s_q) \neq \delta'(q'_0,s_{q'})$, otherwise $M'$ would not distinguish $s_q w$ and $s_{q'} w$. So, if there are $K$ pairwise non-equivalent states in $M$, then a minimal finite state machine for $L(M)$ must have at least $K$ states. Note that if the algorithm constructs a state machine with $K$ states, it means that $Q^2 \setminus \tau$ had $K$ equivalence relations, which means that there exist $K$ non-equivalent states. Therefore, any other deterministic machine will have at least as many states, proving that the constructed machine is minimal. | + | Note that if two distinct states are non-equivalent, there is $w$ such that states $\delta(q_0,s_q w)$ and $\delta(q_0,s_{q'} w)$ have different acceptance, so $M$ distinguishes $s_q w$ and $s_{q'}w$. Now, if we take any other state machine $M' = (\Sigma,Q',\delta',q'_0,F')$ with $L(M')=L(M)$, it means that $\delta'(q'_0,s_q) \neq \delta'(q'_0,s_{q'})$, otherwise $M'$ would not distinguish $s_q w$ and $s_{q'} w$. So, if there are $K$ pairwise non-equivalent states in $M$, then a minimal finite state machine for $L(M)$ must have at least $K$ states. Note that if the algorithm constructs a state machine with $K$ states, it means that $Q^2 \setminus \tau$ had $K$ equivalence relations, which means that there exist $K$ non-equivalent states. Therefore, any other deterministic machine will have at least $K$ states, proving that the constructed machine is minimal. |
+ | |||
+ | === Example === | ||
+ | Construct automaton recognizing | ||
+ | * language {=,<nowiki><=</nowiki>} | ||
+ | * language {=,<nowiki><=</nowiki>,<nowiki>==</nowiki>} | ||
+ | Minimize the automaton. |