This package contains a set of functions for calling interfacing with
HTK from Matlab.  Right now its mostly limited to training GMMs and
HMMs.  It converts your Matlab data into a format that HTK understands
and calls HTK command line programs.  The path to the HTK binaries is
hardcoded in get_htk_path.m

Unforuntately this has not been very thoroughly tested, but it has
been working for me.  If anyone wants to write some unit tests using
http://mlunit.sf.net or something similar I won't say no.


* Functions

They are all reasonably commented...

 - The important ones:
   - train_hmm_htk        - use HTK to train an HMM
   - train_gmm_htk        - use HTK to train a GMM
   - train_htk_recognizer - train a simple HTK recognizer
   - eval_htk_recognizer  - recognize signal using HTK recognizer

 - Utility functions:
   - read_htk_hmm     - read in an HTK HMM definition (text format only)
   - write_htk_hmm    - write an HMM structure as an HTK HMM definition
   - htkread/htkwrite - read/write a Matlab matrix in HTK feature file format
                        (written by Mark Hasegawa-Johnson)
   - compose_hmms     - does FSM composition to form a big HMM from many
                        small HMMs (i.e. get an HMM to recognize a
                        sentence from a bunch of phone HMMs).
   - kmeans           - uses k-means to learn clusters from data.  Used to
                        initialize GMMs and HMMs.
   - logsum           - takes the sum of a matrix of log likelihoods
   - get_htk_path     - centralized location to set the path to the HTK binaries
 
* Data Structures

The functions in this toolbox pass around the following structures:
Note: all probabilities are stored as log probabilities

** GMM
  - gmm.nmix   - number of components in the mixture
  - gmm.priors - array of prior log probabilities over each state
  - gmm.means  - matrix of means (column x is mean of component x)
  - gmm.covars - matrix of covariance (column x is the diagonal of the
                 covariance matrix of component x)

** HMM with GMM observations
  - hmm.name          - 
  - hmm.nstates       - number of states in the HMM
  - hmm.emission_type - 'GMM'
  - hmm.start_prob    - array of log probs P(first observation is state x)
  - hmm.end_prob      - array of log probs P(last observation is state x)
  - hmm.transmat      - matrix of transition log probs (transmat(x,y) 
                        = log(P(transition from state x to state y)))
  - hmm.labels        - optional cell array of labels for each state in the HMM
                        (for use in composing HMMs)
  - hmm.gmms          - array of GMM structures 

** HMM with Gaussian observations
  - hmm.nstates       - number of states in the HMM
  - hmm.emission_type - 'gaussian'
  - hmm.start_prob    - array of log probs P(first observation is state x)
  - hmm.end_prob      - array of log probs P(last observation is state x)
  - hmm.transmat      - matrix of transition log probs (transmat(x,y) 
                        = log(P(transition from state x to state y)))
  - hmm.labels        - optional cell array of labels for each state in the HMM
                        (for use in composing HMMs)
  - hmm.means         - matrix of means (column x is mean of state x)
  - hmm.covars        - matrix of means (column x is the diagonal of the
                        covariance matrix of component x)

Note that each row of exp(hmm.transmat) does not necessarily sum to 1
because for each state x there is some probability
(exp(hmm.exit_prob(x))) that the next transition will be to a
non-emitting exit state (i.e. the current observation is the last
observation in the sequence).  The correct invariant is:
sum(exp(hmm.transmat, 2)) + exp(hmm.exit_prob) == ones(hmm.nstates, 1)