Baum-Welch training is similar to the Viterbi training described in the previous section except that the hard boundary implied by the $\delta$ function is replaced by a soft boundary function $L$ which represents the probability of an observation being associated with any given Gaussian mixture component. This occupation probability is computed from the forward and backward probabilities.
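A standard form of this occupancy is sketched below (a reconstruction rather than a quotation, assuming the usual mixture output densities, with $P$ the total probability defined at the end of this section, $c_{jm}$ the weight of mixture component $m$ in state $j$, and $b_{jm}$ its output density):

$$
L_j(t) = \frac{1}{P}\,\alpha_j(t)\,\beta_j(t),
\qquad
L_{jm}(t) = L_j(t)\,\frac{c_{jm}\, b_{jm}(\mathbf{o}_t)}{b_j(\mathbf{o}_t)},
$$

so the state-level occupancy $L_j(t)$ is apportioned among the mixture components in proportion to each component's share of the state output likelihood $b_j(\mathbf{o}_t)$.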
For the isolated-unit style of training, the forward probability $\alpha_j(t)$ (the joint probability of the first $t$ observations and of occupying state $j$ at time $t$) for $1 < j < N$ and $1 < t \le T$ is calculated by the forward recursion

$$
\alpha_j(t) = \left[\sum_{i=2}^{N-1} \alpha_i(t-1)\, a_{ij}\right] b_j(\mathbf{o}_t)
$$

with initial conditions given by

$$
\alpha_1(1) = 1, \qquad \alpha_j(1) = a_{1j}\, b_j(\mathbf{o}_1)
$$

for $1 < j < N$ and final condition given by

$$
\alpha_N(T) = \sum_{i=2}^{N-1} \alpha_i(T)\, a_{iN}.
$$
The backward probability $\beta_i(t)$ (the probability of generating the remaining observations $\mathbf{o}_{t+1},\ldots,\mathbf{o}_T$ given occupation of state $i$ at time $t$) for $1 < i < N$ and $T > t \ge 1$ is calculated by the backward recursion

$$
\beta_i(t) = \sum_{j=2}^{N-1} a_{ij}\, b_j(\mathbf{o}_{t+1})\, \beta_j(t+1)
$$

with initial conditions given by

$$
\beta_i(T) = a_{iN}
$$

for $1 < i < N$ and final condition given by

$$
\beta_1(1) = \sum_{j=2}^{N-1} a_{1j}\, b_j(\mathbf{o}_1)\, \beta_j(1).
$$
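To make the two recursions concrete, here is a minimal numerical sketch in Python/NumPy, assuming (as in the equations above) an $N$-state model whose first and last states are non-emitting, a zero-based transition matrix `a`, and precomputed output likelihoods `b[t, j]`. The function name and array layout are illustrative rather than part of any toolkit, and a production implementation would use scaling or log arithmetic to prevent underflow:

```python
import numpy as np

def forward_backward(a, b):
    """Forward and backward probabilities for an N-state HMM whose
    entry state (index 0) and exit state (index N-1) are non-emitting.

    a : (N, N) transition matrix, zero-based indexing
    b : (T, N) output likelihoods; columns 0 and N-1 are unused

    Returns alpha, beta (both (T, N)) and the total probability P.
    """
    T, N = b.shape
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))

    # forward initial conditions: alpha_j(1) = a_{1j} b_j(o_1)
    alpha[0, 1:N-1] = a[0, 1:N-1] * b[0, 1:N-1]
    # forward recursion over the emitting states
    for t in range(1, T):
        alpha[t, 1:N-1] = (alpha[t-1, 1:N-1] @ a[1:N-1, 1:N-1]) * b[t, 1:N-1]
    # final condition: P = alpha_N(T)
    p = alpha[T-1, 1:N-1] @ a[1:N-1, N-1]

    # backward initial conditions: beta_i(T) = a_{iN}
    beta[T-1, 1:N-1] = a[1:N-1, N-1]
    # backward recursion
    for t in range(T-2, -1, -1):
        beta[t, 1:N-1] = a[1:N-1, 1:N-1] @ (b[t+1, 1:N-1] * beta[t+1, 1:N-1])

    return alpha, beta, p
```

The state occupation probabilities then follow directly as `L = alpha * beta / p`.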
In the case of embedded training where the HMM spanning the observations is a composite constructed by concatenating $Q$ subword models, it is assumed that at time $t$, the $\alpha$ and $\beta$ values corresponding to the entry and exit states of a HMM represent the forward and backward probabilities at time $t-\delta t$ and $t+\delta t$, respectively, where $\delta t$ is small. The equations for calculating $\alpha$ and $\beta$ are then as follows.
For the forward probability, the initial conditions are established at time $t=1$ as follows

$$
\alpha^{(1)}_1(1) = 1,
$$
$$
\alpha^{(q)}_1(1) = \alpha^{(q-1)}_1(1)\, a^{(q-1)}_{1 N_{q-1}} \quad \text{for } q > 1,
$$
$$
\alpha^{(q)}_j(1) = \alpha^{(q)}_1(1)\, a^{(q)}_{1j}\, b^{(q)}_j(\mathbf{o}_1),
$$
$$
\alpha^{(q)}_{N_q}(1) = \sum_{i=2}^{N_q-1} \alpha^{(q)}_i(1)\, a^{(q)}_{i N_q},
$$

where the superscript in parentheses refers to the index of the model in the sequence of concatenated models and $N_q$ is the number of states in the $q$-th model. All unspecified values of $\alpha$ are zero. For time $t > 1$,

$$
\alpha^{(q)}_1(t) =
\begin{cases}
0 & q = 1 \\
\alpha^{(q-1)}_{N_{q-1}}(t-1) + \alpha^{(q-1)}_1(t)\, a^{(q-1)}_{1 N_{q-1}} & \text{otherwise},
\end{cases}
$$
$$
\alpha^{(q)}_j(t) = \left[\alpha^{(q)}_1(t)\, a^{(q)}_{1j} + \sum_{i=2}^{N_q-1} \alpha^{(q)}_i(t-1)\, a^{(q)}_{ij}\right] b^{(q)}_j(\mathbf{o}_t),
$$
$$
\alpha^{(q)}_{N_q}(t) = \sum_{i=2}^{N_q-1} \alpha^{(q)}_i(t)\, a^{(q)}_{i N_q}.
$$
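The following sketch shows how the forward pass threads through a sequence of concatenated models, under the same assumptions and illustrative naming as the earlier sketch (each `models[q]` is the transition matrix of subword model $q$ with non-emitting first/last states, and `b[q][t, j]` its output likelihoods). The tee transition `a[0, -1]` lets a model be passed through within a single frame, exactly as in the $q > 1$ case above:

```python
import numpy as np

def embedded_forward(models, b):
    """Forward pass over Q concatenated subword HMMs.

    models : list of Q transition matrices; models[q] has shape (Nq, Nq)
             with non-emitting entry (0) and exit (Nq-1) states
    b      : list of Q likelihood arrays; b[q][t, j] for observation t+1

    Returns the total probability alpha_{N_Q}^{(Q)}(T).
    Illustrative only; real code would use log or scaled arithmetic.
    """
    Q, T = len(models), b[0].shape[0]
    alpha = [np.zeros((T, a.shape[0])) for a in models]

    for t in range(T):
        for q, a in enumerate(models):
            N = a.shape[0]
            # entry state alpha_1^{(q)}(t)
            if q == 0:
                entry = 1.0 if t == 0 else 0.0
            else:
                # tee path straight through the previous model ...
                entry = alpha[q-1][t, 0] * models[q-1][0, -1]
                # ... plus its exit state one frame earlier
                if t > 0:
                    entry += alpha[q-1][t-1, -1]
            alpha[q][t, 0] = entry
            # emitting states alpha_j^{(q)}(t)
            inner = entry * a[0, 1:N-1]
            if t > 0:
                inner = inner + alpha[q][t-1, 1:N-1] @ a[1:N-1, 1:N-1]
            alpha[q][t, 1:N-1] = inner * b[q][t, 1:N-1]
            # exit state alpha_{N_q}^{(q)}(t)
            alpha[q][t, N-1] = alpha[q][t, 1:N-1] @ a[1:N-1, N-1]

    return alpha[-1][T-1, -1]
```

Because the entry state of model $q$ depends on the entry state of model $q-1$ at the same time $t$, the inner loop must visit the models in sequence order within each frame.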
For the backward probability, the initial conditions are set at time $t=T$ as follows

$$
\beta^{(Q)}_{N_Q}(T) = 1,
$$
$$
\beta^{(q)}_{N_q}(T) = \beta^{(q+1)}_{N_{q+1}}(T)\, a^{(q+1)}_{1 N_{q+1}} \quad \text{for } q < Q,
$$
$$
\beta^{(q)}_i(T) = a^{(q)}_{i N_q}\, \beta^{(q)}_{N_q}(T),
$$
$$
\beta^{(q)}_1(T) = \sum_{j=2}^{N_q-1} a^{(q)}_{1j}\, b^{(q)}_j(\mathbf{o}_T)\, \beta^{(q)}_j(T),
$$

where once again, all unspecified values of $\beta$ are zero. For time $t<T$,

$$
\beta^{(q)}_{N_q}(t) =
\begin{cases}
0 & q = Q \\
\beta^{(q+1)}_1(t+1) + \beta^{(q+1)}_{N_{q+1}}(t)\, a^{(q+1)}_{1 N_{q+1}} & \text{otherwise},
\end{cases}
$$
$$
\beta^{(q)}_i(t) = a^{(q)}_{i N_q}\, \beta^{(q)}_{N_q}(t) + \sum_{j=2}^{N_q-1} a^{(q)}_{ij}\, b^{(q)}_j(\mathbf{o}_{t+1})\, \beta^{(q)}_j(t+1),
$$
$$
\beta^{(q)}_1(t) = \sum_{j=2}^{N_q-1} a^{(q)}_{1j}\, b^{(q)}_j(\mathbf{o}_t)\, \beta^{(q)}_j(t).
$$
The total probability $P$ can be computed from either the forward or the backward probabilities:

$$
P = \alpha_N(T) = \beta_1(1),
$$

where in the embedded case these refer to the exit state of the final model and the entry state of the first, that is $P = \alpha^{(Q)}_{N_Q}(T) = \beta^{(1)}_1(1)$.
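Since the two passes compute $P$ independently, comparing their values is a useful numerical sanity check on an implementation. With the `forward_backward` sketch above (names illustrative, not from any toolkit):

```python
alpha, beta, p_fwd = forward_backward(a, b)
# beta_1(1) recomputes P from the backward side
p_bwd = a[0, 1:-1] @ (b[0, 1:-1] * beta[0, 1:-1])
assert np.isclose(p_fwd, p_bwd)
```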