Next: 8.7.2 Forward/Backward Probabilities Up: 8.7 Parameter Re-Estimation Formulae Previous: 8.7 Parameter Re-Estimation Formulae

8.7.1 Viterbi Training (HINIT)

In this style of model training, a set of training observations $O^r$, $1 \le r \le R$, is used to estimate the parameters of a single HMM by iteratively computing Viterbi alignments. When used to initialise a new HMM, the Viterbi segmentation is replaced by a uniform segmentation (i.e. each training observation sequence is divided into N equal segments) for the first iteration.
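The uniform segmentation used on the first iteration can be sketched as follows. This is only an illustration: the function name `uniform_segments` is hypothetical, the number of segments is passed in as a parameter, and the way leftover frames are distributed when the length is not an exact multiple is an assumption, not a description of what HINIT does internally.

```python
def uniform_segments(num_frames, num_segments):
    """Split frame indices 0..num_frames-1 into contiguous, near-equal segments.

    Returns a list of (start, end) pairs with end exclusive.
    The rounding rule below is an assumption made for this sketch.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    # Integer boundaries k*T/S, so segment lengths differ by at most one frame.
    bounds = [(k * num_frames) // num_segments for k in range(num_segments + 1)]
    return [(bounds[k], bounds[k + 1]) for k in range(num_segments)]


if __name__ == "__main__":
    print(uniform_segments(10, 3))
```

Each returned segment would then supply the observations used to initialise one state.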

Apart from the first iteration on a new model, each training sequence $O$ is segmented using a state alignment procedure which results from maximising

\[ \phi_N(T) = \max_i \phi_i(T) \, a_{iN} \]

for 1<i<N where

\[ \phi_j(t) = \left[ \max_i \phi_i(t-1) \, a_{ij} \right] b_j(o_t) \]

with initial conditions given by

\[ \phi_1(1) = 1 \]

\[ \phi_j(1) = a_{1j} \, b_j(o_1) \]

for 1<j<N. In this and all subsequent cases, the output probability $b_j(\cdot)$ is as defined in equations 7.1 and 7.2 in section 7.1.
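A minimal sketch of this maximisation, written in the log domain to avoid underflow, might look like the code below. The function names, the 0-based state numbering (state 0 is the non-emitting entry state and state N-1 the non-emitting exit state, so the text's states 2..N-1 become 1..N-2) and the precomputed table of output log probabilities are all assumptions made for illustration; this is not HINIT's implementation.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF


def viterbi_align(log_a, log_b):
    """Return (best log score, state path) for one observation sequence.

    log_a[i][j] : log transition probability from state i to state j.
    log_b[j][t] : log output probability of frame t in emitting state j
                  (rows for the entry and exit states are never read).
    """
    n_states = len(log_a)
    n_frames = len(log_b[1])
    emitting = range(1, n_states - 1)

    phi = [[NEG_INF] * n_frames for _ in range(n_states)]
    back = [[0] * n_frames for _ in range(n_states)]

    # Initial conditions: leave the entry state and emit the first frame.
    for j in emitting:
        phi[j][0] = log_a[0][j] + log_b[j][0]

    # Recursion: best predecessor for each state at each frame.
    for t in range(1, n_frames):
        for j in emitting:
            best_i = max(emitting, key=lambda i: phi[i][t - 1] + log_a[i][j])
            phi[j][t] = phi[best_i][t - 1] + log_a[best_i][j] + log_b[j][t]
            back[j][t] = best_i

    # Termination: best transition into the exit state after the last frame.
    last = max(emitting, key=lambda i: phi[i][n_frames - 1] + log_a[i][n_states - 1])
    score = phi[last][n_frames - 1] + log_a[last][n_states - 1]

    # Trace back the state occupied at each frame.
    path = [last]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[path[-1]][t])
    path.reverse()
    return score, path
```

The returned path gives, for every frame, the emitting state it is aligned to, which is the segmentation used by the estimates that follow.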

If $A_{ij}$ represents the total number of transitions from state i to state j in performing the above maximisations, then the transition probabilities can be estimated from the relative frequencies

\[ \hat{a}_{ij} = \frac{A_{ij}}{\sum_{k=2}^{N} A_{ik}} \]
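The relative-frequency estimate can be illustrated with a short sketch that counts transitions along a set of aligned state paths. The function name `estimate_transitions` and the 0-based state numbering (0 for the entry state, N-1 for the exit state) are assumptions of this example; rows with no observed transitions are simply left at zero here, which is a choice made for the sketch.

```python
def estimate_transitions(paths, n_states):
    """Estimate transition probabilities from per-frame state paths.

    paths    : list of state paths, one per training sequence, each a list
               of emitting-state indices (one entry per frame).
    n_states : total number of states N, including entry and exit states.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for path in paths:
        # Add the implied entry -> first state and last state -> exit moves.
        full = [0] + list(path) + [n_states - 1]
        for i, j in zip(full, full[1:]):
            counts[i][j] += 1

    a_hat = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        # Sum over destination states 2..N of the text (indices 1..N-1 here).
        total = sum(counts[i][1:])
        if total > 0:
            a_hat[i] = [c / total for c in counts[i]]
    return a_hat
```

For example, two paths `[1, 1, 2]` and `[1, 2, 2]` in a four-state model give one self-loop and two forward transitions out of state 1, so the self-loop estimate is 1/3 and the forward estimate is 2/3.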

The sequence of states which maximises $\phi_N(T)$ implies an alignment of training data observations with states. Within each state, a further alignment of observations to mixture components is made. The tool HINIT provides two mechanisms for this: for each state and each stream

  1. use clustering to allocate each observation $o_{st}$ to one of $M_s$ clusters, or
  2. associate each observation $o_{st}$ with the mixture component with the highest probability.

In either case, the net result is that every observation is associated with a single unique mixture component. This association can be represented by the indicator function $\psi^r_{jsm}(t)$ which is 1 if $o^r_{st}$ is associated with mixture component m of stream s of state j and is zero otherwise.

The means and variances are then estimated via simple averages

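\[ \hat{\mu}_{jsm} = \frac{\sum_{r=1}^{R} \sum_{t=1}^{T_r} \psi^r_{jsm}(t) \, o^r_{st}}{\sum_{r=1}^{R} \sum_{t=1}^{T_r} \psi^r_{jsm}(t)} \]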

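\[ \hat{\Sigma}_{jsm} = \frac{\sum_{r=1}^{R} \sum_{t=1}^{T_r} \psi^r_{jsm}(t) \, (o^r_{st} - \hat{\mu}_{jsm})(o^r_{st} - \hat{\mu}_{jsm})'}{\sum_{r=1}^{R} \sum_{t=1}^{T_r} \psi^r_{jsm}(t)} \]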

Finally, the mixture weights are based on the number of observations allocated to each component

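\[ c_{jsm} = \frac{\sum_{r=1}^{R} \sum_{t=1}^{T_r} \psi^r_{jsm}(t)}{\sum_{r=1}^{R} \sum_{t=1}^{T_r} \sum_{l=1}^{M_s} \psi^r_{jsl}(t)} \]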
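As an illustration of these averages, the sketch below estimates the parameters for one state and one stream from observations that have already been given a hard component assignment. The function name `estimate_mixture` and its argument layout are hypothetical, and for brevity it computes only per-dimension (diagonal) variances rather than a full covariance matrix; components with no observations are returned as `None`, which is a choice made for this example.

```python
def estimate_mixture(observations, assignments, n_components):
    """Estimate means, diagonal variances and weights for one state/stream.

    observations : list of equal-length vectors, pooled over all training
                   sequences and frames aligned to this state and stream.
    assignments  : assignments[n] is the component index allocated to
                   observations[n] (the index where the indicator is 1).
    """
    dim = len(observations[0])
    counts = [0] * n_components
    sums = [[0.0] * dim for _ in range(n_components)]
    for obs, m in zip(observations, assignments):
        counts[m] += 1
        for d in range(dim):
            sums[m][d] += obs[d]

    # Means: simple average of the observations allocated to each component.
    means = [[s / counts[m] for s in sums[m]] if counts[m] else None
             for m in range(n_components)]

    # Variances: average squared deviation from the new mean (diagonal only).
    sq_dev = [[0.0] * dim for _ in range(n_components)]
    for obs, m in zip(observations, assignments):
        for d in range(dim):
            sq_dev[m][d] += (obs[d] - means[m][d]) ** 2
    variances = [[v / counts[m] for v in sq_dev[m]] if counts[m] else None
                 for m in range(n_components)]

    # Weights: share of this state's observations allocated to each component.
    total = sum(counts)
    weights = [c / total for c in counts]
    return means, variances, weights
```

With observations `[[1.0], [3.0], [10.0]]` and assignments `[0, 0, 1]`, the first component gets mean 2.0, variance 1.0 and weight 2/3, and the second gets mean 10.0, variance 0.0 and weight 1/3.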



ECRL HTK_V2.1: email [email protected]