Minimization of Deterministic Finite State Machines
We consider deterministic finite state machine .
Goal: build a state machine with the least number of states that accepts the language
.
- we obtain a space-efficient, executable representation of a regular language
This is the process of minimization of .
- an easy case of minimizing size of 'generated code' in compiler
We say that state machine distinguishes strings
and
iff it is not the case that (
iff
).
Minimization Algorithm
Step 1: Remove unreachable states
We first discard states that are not reachable from the initial state–such states are useless. In resulting machine, for each state there exists a string
such that
, let
one such string of minimal length.
(Main) Step 2: Compute Non-Equivalent States
We wish to merge states and
into same group as long as they “behave the same” on all future strings
, i.e.
for all .
If the condition above holds, we called states equivalent. If the condition does not hold, we call states
,
non-equivalent.
States and
are
-non-equivalent if it is not the case that (
).
Two states are non-equivalent iff they are -non-equivalent for some string
.
Observe that
- if
and
then
and
are
-non-equivalent
- if
and
are
-non-equivalent and we have
,
for some symbol
, then
and
are
-non-equivalent
- conversely, if
and
are
-non-equivalent and
is not an empty string, then for
the states
and
are
-non-equivalent
These observations lead to an iterative algorithm for computing non-equivalence relation
- initially put
(only final and non-final states are initially non-equivalent)
- repeat until no more changes: if
and there is
such that
, then
Step 3: Merge States that are not non-equivalent
Relation is an equivalence relation
. We define the 'factor automaton' by merging equivalent states:
- the initial state is
- relation
is a function, and we can use it to define a new deterministic automaton (there is a transition in the resulting automaton iff there is a transition between two states in the original automaton)
This is the minimal automaton.
Correctness of Constructed Automaton
Clearly, this algorithm terminates because in worst case all states become non-equivalent. We will prove below that the resulting value is the non-equivalence relation, i.e. the complement of relation given by
above.
By induction, we can easily prove that if , then
and
are non-equivalent. Similarly we can show that if
and
are
-non-equivalent for
of length
, then
by step
of the algorithm. Because the algorithm terminates, this completes the proof that
is the non-equivalence relation.
Consequently, is the equivalence relation. From the definition of this equivalence it follows that if two states are equivalent, then so is the result of applying
to them. Therefore, we have obtained a well-defined deterministic automaton.
Minimality of Constructed Automaton
Note that if two distinct states are non-equivalent, there is such that states
and
have different acceptance, so
distinguishes
and
. Now, if we take any other state machine
with
, it means that
, otherwise
would not distinguish
and
. So, if there are
pairwise non-equivalent states in
, then a minimal finite state machine for
must have at least
states. Note that if the algorithm constructs a state machine with
states, it means that
had
equivalence relations, which means that there exist
non-equivalent states. Therefore, any other deterministic machine will have at least
states, proving that the constructed machine is minimal.
Example
Construct automaton recognizing
- language {=,<=}
- language {=,<=,==}
Minimize the automaton.