Minimization of Deterministic Finite State Machines
We consider deterministic finite state machine .
Goal: build a state machine with the least number of states that accepts the language .
- we obtain a space-efficient, executable representation of a regular language
This is the process of minimization of .
- an easy case of minimizing size of 'generated code' in compiler
We say that state machine distinguishes strings and iff it is not the case that ( iff ).
Minimization Algorithm
Step 1: Remove unreachable states
We first discard states that are not reachable from the initial state–such states are useless. In resulting machine, for each state there exists a string such that , let one such string of minimal length.
(Main) Step 2: Compute Non-Equivalent States
We wish to merge states and into same group as long as they “behave the same” on all future strings , i.e.
for all .
If the condition above holds, we called states equivalent. If the condition does not hold, we call states , non-equivalent.
States and are -non-equivalent if it is not the case that ().
Two states are non-equivalent iff they are -non-equivalent for some string .
Observe that
- if and then and are -non-equivalent
- if and are -non-equivalent and we have , for some symbol , then and are -non-equivalent
- conversely, if and are -non-equivalent and is not an empty string, then for the states and are -non-equivalent
These observations lead to an iterative algorithm for computing non-equivalence relation
- initially put (only final and non-final states are initially non-equivalent)
- repeat until no more changes: if and there is such that , then
Step 3: Merge States that are not non-equivalent
Relation is an equivalence relation . We define the 'factor automaton' by merging equivalent states:
- the initial state is
- relation is a function, and we can use it to define a new deterministic automaton (there is a transition in the resulting automaton iff there is a transition between two states in the original automaton)
This is the minimal automaton.
Correctness of Constructed Automaton
Clearly, this algorithm terminates because in worst case all states become non-equivalent. We will prove below that the resulting value is the non-equivalence relation, i.e. the complement of relation given by above.
By induction, we can easily prove that if , then and are non-equivalent. Similarly we can show that if and are -non-equivalent for of length , then by step of the algorithm. Because the algorithm terminates, this completes the proof that is the non-equivalence relation.
Consequently, is the equivalence relation. From the definition of this equivalence it follows that if two states are equivalent, then so is the result of applying to them. Therefore, we have obtained a well-defined deterministic automaton.
Minimality of Constructed Automaton
Note that if two distinct states are non-equivalent, there is such that states and have different acceptance, so distinguishes and . Now, if we take any other state machine with , it means that , otherwise would not distinguish and . So, if there are pairwise non-equivalent states in , then a minimal finite state machine for must have at least states. Note that if the algorithm constructs a state machine with states, it means that had equivalence relations, which means that there exist non-equivalent states. Therefore, any other deterministic machine will have at least states, proving that the constructed machine is minimal.
Example
Construct automaton recognizing
- language {=,<=}
- language {=,<=,==}
Minimize the automaton.