Differences
This shows you the differences between two versions of the page.
minimization_of_state_machines [2008/09/24 09:58] vkuncak |
minimization_of_state_machines [2015/04/21 17:32] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Minimization of Deterministic Finite State Machines ====== | ||
- | We consider **deterministic** [[finite state machine]] $M = (\Sigma,Q,\delta,q_0,F)$. | ||
- | |||
- | **Goal:** build a state machine $M'$ with the least number of states that accepts the language $L(M)$. | ||
- | * we obtain a space-efficient, executable representation of a regular language | ||
- | |||
- | This is the process of //minimization// of $M$. | ||
- | * an easy case of minimizing size of 'generated code' in compiler | ||
- | |||
- | We say that state machine $M$ distinguishes strings $w$ and $w'$ iff it is not the case that ($w \in L(M)$ iff $w' \in L(M)$). | ||
- | |||
- | |||
- | ==== Minimization Algorithm ==== | ||
- | |||
- | === Step 1: Remove unreachable states === | ||
- | |||
- | We first discard states that are not reachable from the initial state--such states are useless. In resulting machine, for each state $q$ there exists a string $s$ such that $\delta(q_0,s)=q$, let $s_q$ one such string of minimal length. | ||
- | |||
- | === (Main) Step 2: Compute Non-Equivalent States === | ||
- | |||
- | We wish to merge states $q$ and $q'$ into same group as long as they "behave the same" on all future strings $w$, i.e. | ||
- | \[ | ||
- | \delta(q,w) \in F \mbox{ iff } \delta(q',w) \in F \ \ \ (*) | ||
- | \] | ||
- | for all $w$. | ||
- | |||
- | If the condition $(*)$ above holds, we called states **equivalent**. If the condition does **not** hold, we call states $q$,$q'$ **non-equivalent**. | ||
- | |||
- | States $q$ and $q'$ are $w$-non-equivalent if it is not the case that ($\delta(q,w) \in F \mbox{ iff } \delta(q',w) \in F$). | ||
- | |||
- | Two states are non-equivalent iff they are $w$-non-equivalent for some string $w$. | ||
- | |||
- | Observe that | ||
- | - if $q \in F$ and $q' \notin F$ then $q$ and $q'$ are $\epsilon$-non-equivalent | ||
- | - if $q$ and $q'$ are $w$-non-equivalent and we have $\delta(r,a)=q$, $\delta(r',a)=q'$ for some symbol $a \in \Sigma$, then $r$ and $r'$ are $aw$-non-equivalent | ||
- | - conversely, if $r$ and $r'$ are $w'$-non-equivalent and $w$ is not an empty string, then for $w'=aw$ the states $\delta(r,a)$ and $\delta(r',a)$ are $w$-non-equivalent | ||
- | |||
- | These observations lead to an iterative algorithm for computing non-equivalence relation $\nu$ | ||
- | - initially put $\nu = (Q \cap F) \times (Q \setminus F)$ (only final and non-final states are initially non-equivalent) | ||
- | - repeat until no more changes: if $(r,r') \notin \nu$ and there is $a \in \Sigma$ such that $(\delta(r,a),\delta(r',a)) \in \nu$, then | ||
- | \[ | ||
- | \nu := \nu \cup \{(r,r')\} | ||
- | \] | ||
- | |||
- | === Step 3: Merge States that are not non-equivalent === | ||
- | |||
- | Relation $Q^2 \setminus \nu$ is an equivalence relation $\sim$. We define the 'factor automaton' by merging equivalent states: | ||
- | * the initial state is ${q_0}/_{\sim}$ | ||
- | * $Q/_{\sim} = \{ \{ y \mid x \sim y \} \mid x \in Q \}$ | ||
- | * $F/_{\sim} = \{ \{ y \mid x \sim y \} \mid x \in F \}$ | ||
- | * relation $r = \{ ([x],[y]) \mid (x,y) \in \delta \}$ is a function, and we can use it to define a new deterministic automaton (there is a transition in the resulting automaton iff there is a transition between two states in the original automaton) | ||
- | |||
- | This is the minimal automaton. | ||
- | |||
- | ==== Correctness of Constructed Automaton ==== | ||
- | |||
- | Clearly, this algorithm terminates because in worst case all states become non-equivalent. We will prove below that the resulting value $\nu$ is the non-equivalence relation, i.e. the complement of relation given by $(*)$ above. | ||
- | |||
- | By induction, we can easily prove that if $(q,q') \in \nu$, then $q$ and $q'$ are non-equivalent. Similarly we can show that if $q$ and $q'$ are $w$-non-equivalent for $w$ of length $k$, then $(q,q') \in \nu$ by step $k$ of the algorithm. Because the algorithm terminates, this completes the proof that $\nu$ is the non-equivalence relation. | ||
- | |||
- | Consequently, $Q^2 \setminus \nu$ is the equivalence relation. From the definition of this equivalence it follows that if two states are equivalent, then so is the result of applying $\delta$ to them. Therefore, we have obtained a well-defined deterministic automaton. | ||
- | |||
- | |||
- | |||
- | ==== Minimality of Constructed Automaton ==== | ||
- | |||
- | Note that if two distinct states are non-equivalent, there is $w$ such that states $\delta(q_0,s_q w)$ and $\delta(q_0,s_{q'} w)$ have different acceptance, so $M$ distinguishes $s_q w$ and $s_{q'}w$. Now, if we take any other state machine $M' = (\Sigma,Q',\delta',q'_0,F')$ with $L(M')=L(M)$, it means that $\delta'(q'_0,s_q) \neq \delta'(q'_0,s_{q'})$, otherwise $M'$ would not distinguish $s_q w$ and $s_{q'} w$. So, if there are $K$ pairwise non-equivalent states in $M$, then a minimal finite state machine for $L(M)$ must have at least $K$ states. Note that if the algorithm constructs a state machine with $K$ states, it means that $Q^2 \setminus \tau$ had $K$ equivalence relations, which means that there exist $K$ non-equivalent states. Therefore, any other deterministic machine will have at least $K$ states, proving that the constructed machine is minimal. | ||
- | |||
- | === Example === | ||
- | |||
- | Construct automaton recognizing | ||
- | * language {=,<nowiki><=</nowiki>} | ||
- | * language {=,<nowiki><=</nowiki>,<nowiki>==</nowiki>} | ||
- | Minimize the automaton. |