13.7.1 Function

HINIT provides initial estimates for the parameters of a single HMM from a set of observation sequences. It works by repeatedly using Viterbi alignment to segment the training observations and then recomputing the parameters by pooling the vectors in each segment. For mixture Gaussians, each vector in each segment is aligned with the component with the highest likelihood, and each resulting cluster of vectors determines the parameters of the associated mixture component. In the absence of an initial model, the process is started by performing a uniform segmentation of each training observation; for mixture Gaussians, the vectors in each uniform segment are then clustered using a modified K-Means algorithm.
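
As a rough illustration of the uniform-segmentation start described above, the following Python sketch splits each observation sequence evenly across the states and pools the observations per state to obtain a mean and variance. It is not HINIT's implementation: the function names are hypothetical, observations are scalars rather than vectors, and each state has a single Gaussian, so no K-Means step is needed.

```python
def uniform_segments(num_frames, num_states):
    """Split frames 0..num_frames-1 into num_states contiguous, near-equal
    segments, returned as (start, end) pairs with end exclusive."""
    bounds = [(i * num_frames) // num_states for i in range(num_states + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(num_states)]


def pooled_mean_var(sequences, num_states):
    """Pool scalar observations per state over all sequences and return
    (means, variances). Assumption: one Gaussian per state, scalar data."""
    pools = [[] for _ in range(num_states)]
    for seq in sequences:
        segments = uniform_segments(len(seq), num_states)
        for state, (start, end) in enumerate(segments):
            pools[state].extend(seq[start:end])
    if any(not pool for pool in pools):
        raise ValueError("every state needs at least one observation")
    means = [sum(pool) / len(pool) for pool in pools]
    variances = [
        sum((x - m) ** 2 for x in pool) / len(pool)
        for pool, m in zip(pools, means)
    ]
    return means, variances


means, variances = pooled_mean_var([[1.0, 3.0, 5.0, 5.0, 9.0, 9.0]], 3)
```

In the full procedure, estimates of this kind would seed the first Viterbi alignment, after which segmentation and pooling are repeated.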

HINIT can be used to provide initial estimates of whole-word models, in which case the observation sequences are realisations of the corresponding vocabulary word. Alternatively, HINIT can be used to generate initial estimates of seed HMMs for phoneme-based speech recognition. In this latter case, the observation sequences will consist of segments of continuously spoken training material. Given a segment label, HINIT will cut these segments out of the training data automatically.
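
The idea of cutting labelled segments out of continuous training data can be sketched as follows. This is only an illustration under simplifying assumptions: the function name and the label representation (tuples of start frame, end frame and label name, with the end exclusive) are hypothetical and do not reflect HTK's label file format.

```python
def cut_segments(frames, labels, target):
    """Return every run of frames whose label name equals target.

    frames: list of observations for one utterance.
    labels: list of (start_frame, end_frame, name) tuples, end exclusive
            (a hypothetical representation chosen for this sketch).
    """
    return [frames[start:end] for start, end, name in labels if name == target]


frames = list(range(10))
labels = [(0, 3, "sil"), (3, 7, "ih"), (7, 10, "sil")]
ih_segments = cut_segments(frames, labels, "ih")
sil_segments = cut_segments(frames, labels, "sil")
```

Each returned segment would then serve as one observation sequence for the model being initialised.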

In both of the above applications, HINIT normally takes as input a prototype HMM definition which defines the required HMM topology, i.e. it has the form of the required HMM except that the means, variances and mixture weights are ignored. The transition matrix of the prototype specifies both the allowed transitions and their initial probabilities. Transitions which are assigned zero probability will remain zero and hence denote non-allowed transitions. HINIT estimates transition probabilities by counting the number of times each state is visited during the alignment process.
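
A minimal sketch of count-based transition estimation is given below. It assumes each alignment is simply a list of state indices, one per frame, and it omits the handling of model entry and exit; the function name and the fallback for states that are never left are choices made for this sketch, not a description of HINIT's code.

```python
def estimate_transitions(alignments, prototype):
    """Estimate a transition matrix by counting transitions in alignments.

    alignments: list of state-index paths, one index per frame.
    prototype:  square matrix of initial transition probabilities; a zero
                entry marks a transition that is not allowed.
    """
    n = len(prototype)
    counts = [[0] * n for _ in range(n)]
    for path in alignments:
        for src, dst in zip(path, path[1:]):
            if prototype[src][dst] == 0.0:
                raise ValueError(f"transition {src}->{dst} is not allowed")
            counts[src][dst] += 1
    estimated = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total:
            estimated.append([c / total for c in row])
        else:
            # Assumption for this sketch: keep the prototype row when no
            # transition out of state i was observed.
            estimated.append(list(prototype[i]))
    return estimated


prototype = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]
alignments = [[0, 0, 1, 2, 2], [0, 1, 1, 1, 2]]
trans = estimate_transitions(alignments, prototype)
```

Because a disallowed transition can never appear in a valid alignment, its count stays at zero and so does its estimated probability.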

HINIT supports multiple mixtures, multiple streams, parameter tying within a single model, full or diagonal covariance matrices, tied-mixture models and discrete models. The output of HINIT is typically input to HREST.

Like all re-estimation tools, HINIT allows a floor to be set on each individual variance by defining a variance floor macro for each data stream (see chapter 8).
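
The effect of a variance floor can be illustrated with a short sketch. It assumes a diagonal covariance represented as a list of per-dimension variances and a floor vector of the same length; the function name is hypothetical and the way HTK stores the floor macro is not shown.

```python
def apply_variance_floor(variances, floor):
    """Raise each variance to at least the matching floor value."""
    if len(variances) != len(floor):
        raise ValueError("variance and floor vectors differ in length")
    return [max(v, f) for v, f in zip(variances, floor)]


floored = apply_variance_floor([0.5, 1e-6, 2.0], [0.01, 0.01, 0.01])
```

Flooring of this kind guards against variances collapsing towards zero when only a few vectors are pooled into a state or mixture component.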

ECRL HTK_V2.1: email [email protected]