4. Part 3: Implement Gaussian training within the Forward-Backward algorithm

In this part of the lab, you will implement the statistics collection needed for reestimating a Gaussian distribution as well as the actual reestimation algorithm. This will involve filling in the functions UpdateCountsRefR and ReestimateS in Lab2_AM.C.

More specifically, at the beginning of each iteration through the training data, all of the Gaussian counts will be initialized to zero. Then, for each utterance, all of your calls to graph.add_count() during the forward-backward computation will now pass these posterior counts to UpdateCountsRefR, where you update the Gaussian statistics. At the end of each iteration through the training data, the routine ReestimateS will be called for each Gaussian so that you can reestimate its parameters from the counts you collected. The update equations can be found in equations (9.50) and (9.51) on p. 152 of Holmes or in equations (8.55) and (8.56) on p. 396 of HAH. Hint: remember to use the new mean when reestimating the variance. Hint: remember that we are using diagonal-covariance Gaussians, so you only need to reestimate the variances along the diagonal of the covariance matrix.

Your code will be compiled into the training program TrainLab2. To compile this program with your code, type
smk TrainLab2
To run this trainer on some utterances representing isolated digits, run
This script just collects counts for training HMM observation probabilities from the same mini-training set as before, reestimates Gaussian parameters, and outputs them to the file p3a.oprobs. The “correct” output can be found in the file p3a.oprobs in ~stanchen/e6884/lab2/; only look at the counts after the line “<gaussians>”. Again, it's OK if your output doesn't match exactly, but it should be quite close.

Once you think you have this part working, run the script
This script reestimates observation probabilities on the training set (while leaving transition probabilities unchanged), performing twenty iterations of the forward-backward algorithm and outputting the average logprob/frame (as computed in the forward algorithm) at each iteration. If your implementation of Gaussian reestimation is correct, this logprob should be nondecreasing from iteration to iteration and should look like it's converging. This script will output the trained Gaussian parameters to the file p3b.oprobs. (Debugging hint: things should still converge if you update only means and not variances; by commenting out your variance update, you can check whether your mean update is working.)

Decode the same test data as in the last part with this trained observation model (and untrained transition model) by running the script
By comparing the word-error rate found here with that found in the corresponding run in the last part, we can see the relative importance of transition and observation probabilities. (BTW, these error rates will be very poor since the training set is very small; some digits do not occur in it at all.)

For further evidence, run the script:
This script starts with the observation model in p3b.oprobs and trains both the observation and transition probabilities on the given training set for five iterations, creating the files p3d.oprobs and p3d.tprobs. It then decodes the same test set as before with these new models.