Hidden Markov Models are a fundamental technology underlying
almost all of today's speech recognition systems. They are
simple and elegant, and yet stunningly powerful. Indeed, they
are often pointed to as evidence of *intelligent design*
as it is deemed inconceivable that they evolved spontaneously
from simpler probabilistic models such as multinomial or
Poisson distributions.

The goal of this assignment is for you, the student, to implement the
basic algorithms in an HMM/GMM-based speech recognition system, including
algorithms for both training and decoding. For simplicity, we will use
individual Gaussians to model the output distributions of HMM arcs
rather than mixtures of Gaussians, and the HMM's we use will not
contain “skip” arcs (*i.e.*, all arcs have output distributions). For this
lab, we will be working with isolated digit utterances (as in Lab 1)
as well as continuous digit strings.
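
As a rough illustration of what evaluating a single-Gaussian output distribution involves, here is a minimal sketch in Python. It assumes a diagonal covariance (the lab description does not say which covariance structure is used), and the function name `gaussian_log_likelihood` and its arguments are hypothetical, not part of the lab code.

```python
import math


def gaussian_log_likelihood(x, mean, var):
    """Log density of feature vector x under one Gaussian.

    Assumption: diagonal covariance, so `var` holds one variance
    per dimension and the dimensions are treated independently.
    """
    if not (len(x) == len(mean) == len(var)):
        raise ValueError("x, mean and var must have the same length")
    total = 0.0
    for xi, mu, v in zip(x, mean, var):
        # Per-dimension log N(xi; mu, v), summed over dimensions.
        total += -0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
    return total
```

Working in the log domain, as above, avoids the numerical underflow that comes from multiplying many small probabilities across the frames of an utterance.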

The lab consists of the following parts, all of which are required:

- *Part 1: Implement the Viterbi algorithm and Gaussian likelihood evaluation.* Given a trained model, write algorithms for finding the most likely word sequence given an utterance.
- *Part 2: Implement most of the Forward-Backward algorithm.* Write the forward and backward algorithms needed for training HMM's, and test them by training the transition probabilities of an HMM.
- *Part 3: Implement Gaussian training within the Forward-Backward algorithm.* Add the updating of observation probabilities to Part 2.
- *Part 4: Train a model from scratch, and evaluate it on various digit test sets.*
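
To make the decoding step in Part 1 concrete, here is a minimal sketch of the Viterbi recursion on an HMM whose arcs carry the output distributions and where every arc consumes one frame. The data layout is an assumption made for illustration: arcs are `(src, dst, log_trans)` tuples, and `arc_logprobs[t][a]` is a precomputed log output likelihood of frame `t` on arc `a`. The function name `viterbi` and its parameters are likewise hypothetical. This is not the lab's reference solution.

```python
import math

NEG_INF = float("-inf")


def viterbi(arcs, n_states, start, final, arc_logprobs):
    """Return (best log prob, best arc sequence) from `start` to `final`.

    arcs: list of (src, dst, log_trans) tuples (hypothetical layout).
    arc_logprobs[t][a]: log output likelihood of frame t on arc a.
    """
    n_frames = len(arc_logprobs)
    # chart[t][s]: best log prob of being in state s after t frames.
    chart = [[NEG_INF] * n_states for _ in range(n_frames + 1)]
    # back[t][s]: index of the arc that achieved chart[t][s].
    back = [[None] * n_states for _ in range(n_frames + 1)]
    chart[0][start] = 0.0

    for t in range(n_frames):
        for a, (src, dst, log_trans) in enumerate(arcs):
            if chart[t][src] == NEG_INF:
                continue  # source state unreachable at this frame
            score = chart[t][src] + log_trans + arc_logprobs[t][a]
            if score > chart[t + 1][dst]:
                chart[t + 1][dst] = score
                back[t + 1][dst] = a

    # Trace the best path backwards from the final state.
    path = []
    state = final
    for t in range(n_frames, 0, -1):
        a = back[t][state]
        if a is None:
            return NEG_INF, []  # final state not reachable
        path.append(a)
        state = arcs[a][0]
    path.reverse()
    return chart[n_frames][final], path


# Toy example: state 0 has a self-loop (arc 0) and an arc to state 1
# (arc 1); state 1 has a self-loop (arc 2). Three frames of scores.
toy_arcs = [(0, 0, math.log(0.5)), (0, 1, math.log(0.5)), (1, 1, math.log(0.5))]
toy_scores = [
    [math.log(0.9), math.log(0.1), math.log(0.1)],
    [math.log(0.9), math.log(0.1), math.log(0.1)],
    [math.log(0.1), math.log(0.9), math.log(0.9)],
]
best_score, best_path = viterbi(toy_arcs, 2, 0, 1, toy_scores)
```

Replacing the `max` implied by the `if score > ...` update with a log-domain sum gives the forward pass used in Part 2.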

All of the files needed for the lab can be found in the
directory `~stanchen/e6884/lab2/`. Before
starting the lab, please read the file `lab2.txt`; this
includes all of the questions you will have to answer while
doing the lab. Questions about the lab can be posted
on Courseworks (`https://courseworks.columbia.edu/`);
a discussion topic will be created for each lab.
*Note:* The hyperlinks in this document
are enclosed in square brackets; you need an online version
of this document to see where they point.

Please make liberal use of the Courseworks discussion group for this lab; judging from last year, it's a toughie.