In this part, we introduce the IBM FSM toolkit as a first step
in learning more about the static graph expansion process.
In particular, you will be using the program FsmOp,
which is a utility that can perform a variety of finite-state
operations on weighted FSA's and FST's. The program
FsmOp is like a calculator that operates on
FSM's rather than real values, where arguments are
input using
[reverse Polish notation]
as is used in some
[HP calculators].
For example, to produce the composition of an FSA held
in the file foo.fsm and an FST held in bar.fsm, you
would use the command
FsmOp foo.fsm bar.fsm -compose > result.fsm
|
Operations begin with the “
-” character, and include
-compose,
-determinize, and
-minimize among many others.
By default, the resulting FSM is written to standard output, but
if the last argument supplied is a filename (rather than an operation),
the resulting FSM will be written to that file instead. Thus, the command
FsmOp foo.fsm bar.fsm -compose result.fsm
|
has the same effect as the last example.
We explain the default FSA format through an example:
1 2 foo
1 2 bar
2 3 moo
3
|
Each line with three fields describes an arc in the FSM; the format
is:
src-state dst-state label. States need
not be numbered starting from 1; the label
<epsilon> is
used to represent the empty label. Each line with a single field lists
a final state. The first state mentioned in the file is the start state.
Thus, the above FSA file corresponds to:
We drew the above Postscript diagram using the following command:
FsmOp foo.fsm -draw | dot -Tps > foo.ps
|
You can use similar commands to help you visualize FSM's.
For weighted FSA's, each line can optionally be followed with
a cost, or negative log probability base 10. If a cost
is omitted on a line, it is taken to be zero. For example,
here is a weighted FSA:
1 2 foo
1 2 bar 2.4
2 3 <epsilon>
2 1.2
3
|
corresponding to
Finite-state transducers have a similar format, except
lines representing arcs have an extra field:
src-state dst-state in-label out-label [optional-cost]
|
In addition, to signal that a file holds an FST rather than an FSA,
the following line should be included at the start of the file:
Here is an example FST:
# transducer: true
1 2 ax AX
1 2 bar BAR 1.0
2 3 moo <epsilon>
2 1.2
3
|
For this part of the lab, you will have to create various FSM's
and perform operations on them, as described in lab4.txt.
Here are some hints:
Don't forget to add final states to your FSM's! Without these,
your FSM's will be equivalent to empty FSM's.
For transducers, don't forget the line “# transducer: true”!
For a list of all of the operations that FsmOp can perform,
run FsmOp with no arguments.
To prepare for this part, create the relevant subdirectory
and copy over the needed files:
mkdir -p ~/e6884/lab4/
cd ~/e6884/lab4/
cp ~stanchen/e6884/lab4/copy/* .
cp ~stanchen/e6884/lab4/.mk_chain .
|
This will also copy over all files needed in later parts of the lab.