In this part and the next, you will implement the Forward-Backward algorithm for HMM/GMM training. We have broken the work into stages to hopefully make it more manageable.

That is, in order to do meaningful decoding as in Part 1, we need to
have trained HMM and GMM parameters. In particular, we need to
select the following parameters: HMM arc probabilities, and
Gaussian means and variances. To be specific, our decoding vocabulary
consists of twelve words (`ONE` to `NINE`, `ZERO`,
`OH`, and silence) consisting of a total of 34 phonemes.
We use 3 HMM states for each phoneme (not including final states),
giving us 102 states total. Each of these states has two outgoing arcs
that we need to assign probabilities to, and a Gaussian that
we need to estimate.

To see the transition probabilities and Gaussian parameters used
in Part 1, look at the files `p1.tprobs` and `p1.oprobs`
in `~stanchen/e6884/lab2/`. The file `p1.tprobs` holds
the outgoing transition probabilities for each of the 102 states, the
self-loop probability followed by the exit probability. The
mapping from words to state indices is arbitrary, *e.g.*, the
word `EIGHT` is assigned states 0 through 5, and the
silence word is assigned states 99 to 101.
(See p. 156 of the Holmes text for how to interpret transition
probabilities.)
In the file
`p1.oprobs`, look for the `<gaussians>` line.
After that line, alternating means and variances are listed for the
Gaussian associated with each state's outgoing arcs.

Now, we can use the Forward-Backward algorithm to estimate
these parameters. We first initialize all parameters
arbitrarily (*e.g.*, transition probabilities of 0.5, means of 0, variances of 1).
Then, we iterate over our training data, reestimating our
parameters after each iteration; the likelihood our model
assigns to the training data is guaranteed not to decrease
from one iteration to the next.

What we do in each iteration is the following:

1. Initialize all counts to 0.
2. Loop through each utterance in the training data:
   a. Use a front end to produce feature vectors.
   b. Create an HMM by splicing together the word HMMs for each word in the utterance's reference transcript.
   c. Run Forward-Backward on this HMM and the acoustic feature vectors to update the counts.
3. Use the counts to reestimate our parameters.

Now, the Forward-Backward algorithm can be decomposed into
two parts: computing the posterior probabilities of each
arc `a` at each frame `t`, usually denoted something
like `gamma(a,t)`; and using
these `gamma(a,t)` values to update the counts
you need to update. In Part 2 of the lab, we'll have
you do the `gamma(a,t)` computation, and
we'll supply the code that uses the `gamma(a,t)`
values to update transition counts and probabilities.
(Updating transition probabilities is quite simple; just
sum and normalize.) In Part 3 of the lab, we'll have
you write the code for updating Gaussian counts, and
the code to use these counts to update Gaussian means
and variances.

In this part, you will be implementing the forward and
backward passes of the Forward-Backward algorithm to calculate
the posterior probability of each transition at
each frame, but not the statistics collection and
reestimation parts of the FB algorithm.
Your job in this part is to fill in the function
`ForwardBackwardLab2G` in `Lab2_DP.C`.

For this lab, you will only need to calculate the
posterior counts of *arcs* and not *states*, since
both transition probabilities and observations are located
on arcs in this lab. Again, this convention agrees with
the slides from Lecture 4, but not the readings, so be careful.
More precisely, you need to calculate the values
`c_{t}(tr_{ij}|X)`
(which are just another name for the `gamma(a,t)` values described above).
More specifically, you need to write code that fills in the forward (log)
probability `forwLogProbM` for each cell in the
dynamic programming chart. Then, we provide code to compute
the total (log) probability of the utterance, and code
that initializes the backward (log) probability `backLogProbM`
for each state for the last frame in the chart. Then,
you need to fill in code that does the rest of the backward
algorithm, and that computes the posterior counts. You
must call the routine `graph.add_count()` with these posterior
counts; it forwards them to the transition-count updating code.
Hint: read Section 9.12.2 (p. 154) of the Holmes text in the assigned readings.
(Pedantic note: posterior counts below 0.001 are not forwarded for
efficiency's sake.)

Your code will be compiled into the training program **TrainLab2**.
To compile this program with your code, type

    smk TrainLab2

Then, run the script

    lab2p2a.sh

To help with debugging, we have provided the file `lab2p2a.debug`
which contains portions of the DP chart for the first utterance
when running `lab2p2a.sh`.

Once you think you have this part working, run the script

    lab2p2b.sh

Finally, decode some (isolated digit) test data with these trained
transition probabilities (and the original
*untrained* observation probabilities)
by running the script

    lab2p2c.sh