Next: 8 HMM Parameter Estimation Up: 7 HMM Definition Files Previous: 7.8 Binary Storage Format

7.9 The HMM Definition Language

To conclude this chapter, this section presents a formal description of the HMM definition language used by HTK. Syntax is described using an extended BNF notation in which alternatives are separated by a vertical bar |, parentheses () denote factoring, brackets [ ] denote options, and braces {} denote zero or more repetitions.

All keywords are enclosed in angle brackets and the case of the keyword name is not significant. White space is not significant except within double-quoted strings.
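These lexical conventions can be illustrated with a short Python sketch. It is not HTK's own reader: the function name `tokenize` and the (kind, value) token representation are assumptions made for this example, and it makes no attempt to handle the binary storage format.

```python
import re

# A keyword in angle brackets, a double-quoted string, or any other run of
# non-space characters (numbers, macro symbols, bare names).
_TOKEN = re.compile(r'<([^<>\s]+)>|"([^"]*)"|([^\s<>"]+)')

def tokenize(text):
    """Return a list of (kind, value) pairs for a text HMM definition.

    Keywords are upper-cased because their case is not significant.
    White space only separates tokens, except inside double quotes,
    where it is preserved.
    """
    tokens = []
    for m in _TOKEN.finditer(text):
        if m.group(1) is not None:
            tokens.append(("keyword", m.group(1).upper()))
        elif m.group(2) is not None:
            tokens.append(("string", m.group(2)))
        else:
            tokens.append(("word", m.group(3)))
    return tokens
```

With this sketch, `<beginhmm>` and `<BeginHMM>` produce the same keyword token, and a quoted name such as "my model" is kept as a single string token.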

The top level structure of a HMM definition is shown by the following rule.

 
    hmmdef = [ ~h macro ]
             <BeginHMM>
                 [ globalOpts ]
                 <NumStates> short
                 state { state }
                 transP
                 [ duration ]
             <EndHMM>

A HMM definition consists of an optional set of global options followed by the <NumStates> keyword, whose argument specifies the number of states in the model, inclusive of the non-emitting entry and exit states. The information for each state is then given in turn, followed by the parameters of the transition matrix and the model duration parameters, if any. The name of the HMM is given by the ~h macro. If the HMM is the only definition within a file, the ~h macro name can be omitted and the HMM name is assumed to be the same as the file name.

The global options are common to all HMMs. They can be given separately using a ~o option macro

    optmacro = ~o globalOpts

or they can be included in one or more HMM definitions. Global options may be repeated, but no definition can change a previous definition. All global options must be defined before any other macro definition is processed. In practice, this means that any HMM system which uses parameter tying must have a ~o option macro at the head of the first macro file processed.

The full set of global options is given below. Every HMM set must define the vector size (via <VecSize> ), the stream widths (via <StreamInfo> ) and the observation parameter kind. However, if only the stream widths are given then the vector size will be inferred. If only the vector size is given, then a single stream of identical width will be assumed. All other options default to null.

 
    globalOpts = option { option }
    option     = <StreamInfo> short { short } |
                 <VecSize> short |
                 covkind |
                 durkind |
                 parmkind

The arguments to the <StreamInfo> option are the number of streams (default 1) and then for each stream, the width of that stream. The <VecSize> option gives the total number of elements in each input vector. If both <VecSize> and <StreamInfo> are included then the sum of all the stream widths must equal the input vector size.
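The defaulting and consistency rules for <VecSize> and <StreamInfo> can be sketched in Python as follows. The function name `resolve_sizes` and its argument layout are hypothetical; `stream_info` is assumed to hold the shorts exactly as they follow the keyword, that is, the stream count followed by one width per stream.

```python
def resolve_sizes(vec_size=None, stream_info=None):
    """Return (vec_size, stream_widths) after applying the rules above.

    vec_size    -- argument of <VecSize>, or None if the option is absent
    stream_info -- arguments of <StreamInfo> as [S, w1, ..., wS], or None
    """
    widths = None
    if stream_info is not None:
        num_streams, widths = stream_info[0], list(stream_info[1:])
        if len(widths) != num_streams:
            raise ValueError("<StreamInfo> needs one width per stream")

    if vec_size is None and widths is None:
        raise ValueError("neither <VecSize> nor <StreamInfo> was given")
    if widths is None:
        # Only the vector size is given: a single stream of identical width.
        return vec_size, [vec_size]
    if vec_size is None:
        # Only the stream widths are given: infer the vector size.
        return sum(widths), widths
    if sum(widths) != vec_size:
        raise ValueError("stream widths must sum to the vector size")
    return vec_size, widths
```

For example, a 39-element vector split into three streams of width 13 is accepted whether or not <VecSize> 39 is also present, whereas widths of 20 and 20 alongside <VecSize> 39 are rejected.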

The covkind defines the kind of the covariance matrix

 
    covkind = <DiagC> | <InvDiagC> | <FullC> |
              <LLTC> | <XformC>

where <InvDiagC> is used internally. <LLTC> and <XformC> are not used in HTK Version 2.0. Setting the covariance kind as a global option forces all components to have this kind. In particular, it prevents mixing full and diagonal covariances within a HMM set.

The durkind denotes the type of duration model used according to the following rules

 
    durkind = <nullD> | <poissonD> | <gammaD> | <genD>

For anything other than <nullD>, a duration vector must be supplied for the model or each state as described below.

The parameter kind is any legal parameter kind including qualified forms (see section 5.1)

 
    parmkind = <basekind{_D|_A|_E|_N|_Z|_O}>
    basekind = <discrete> | <lpc> | <lpcepstra> | <mfcc> | <fbank> |
               <melspec> | <lprefc> | <lpdelcep> | <user>

where the syntax rule for parmkind is non-standard in that no spaces are allowed between the base kind and any subsequent qualifiers. As noted in chapter 5, <lpdelcep> is provided only for compatibility with earlier versions of HTK and its further use should be avoided.
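As a rough illustration of the parmkind rule, the following Python sketch splits a parameter kind keyword into its base kind and qualifiers. The function name `parse_parmkind` is hypothetical, and the sketch checks only the syntax given above, not whether a particular combination of qualifiers is meaningful.

```python
BASE_KINDS = {"DISCRETE", "LPC", "LPCEPSTRA", "MFCC", "FBANK",
              "MELSPEC", "LPREFC", "LPDELCEP", "USER"}
QUALIFIERS = {"D", "A", "E", "N", "Z", "O"}

def parse_parmkind(keyword):
    """Split e.g. '<MFCC_E_D>' into ('MFCC', ['E', 'D'])."""
    if not (keyword.startswith("<") and keyword.endswith(">")):
        raise ValueError("parameter kind must be enclosed in angle brackets")
    body = keyword[1:-1]
    # Unlike the rest of the language, no spaces are allowed here.
    if any(ch.isspace() for ch in body):
        raise ValueError("no spaces are allowed inside a parameter kind")
    # Keyword case is not significant, so compare in upper case.
    base, *quals = body.upper().split("_")
    if base not in BASE_KINDS:
        raise ValueError("unknown base kind: " + base)
    for q in quals:
        if q not in QUALIFIERS:
            raise ValueError("unknown qualifier: _" + q)
    return base, quals
```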

Each state of each HMM must have its own section defining the parameters associated with that state

 
    state = <State> short stateinfo

where the short following <State> is the state number. State information can be defined in any order. The syntax is as follows

 
    stateinfo = ~s macro |
                [ mixes ] [ weights ] stream { stream } [ duration ]
    macro     = string

A stateinfo definition consists of an optional specification of the number of mixtures, an optional set of stream weights, followed by a block of information for each stream, optionally terminated with a duration vector. Alternatively, ~s macro can be written, where macro is the name of a previously defined macro.

The optional mixes in a stateinfo definition specify the number of mixture components (or discrete codebook size) for each stream of that state

 
    mixes = <NumMixes> short { short }

where there should be one short for each stream. If this specification is omitted, it is assumed that all streams have just one mixture component.
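The default for an omitted <NumMixes> specification might be applied as in the following Python sketch. The function name `mixes_per_stream` and its arguments are hypothetical.

```python
def mixes_per_stream(num_streams, num_mixes=None):
    """Return the number of mixture components for each stream of a state.

    num_mixes -- the shorts following <NumMixes>, or None if omitted
    """
    if num_mixes is None:
        # Specification omitted: every stream has one mixture component.
        return [1] * num_streams
    if len(num_mixes) != num_streams:
        raise ValueError("<NumMixes> needs one value per stream")
    return list(num_mixes)
```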

The optional weights in a stateinfo definition define a set of exponent weights for each independent data stream. The syntax is

 
    weights = ~w macro | <SWeights> short vector
    vector  = float { float }

where the short gives the number S of weights (which should match the value given in the <StreamInfo> option) and the vector contains the S stream weights (see section 7.1).

The definition of each stream depends on the kind of HMM set. In the normal case, it consists of a sequence of mixture component definitions optionally preceded by the stream number. If the stream number is omitted then it is assumed to be 1. For tied-mixture and discrete HMM sets, special forms are used.

 
    stream = [ <Stream> short ]
             (mixture { mixture } | tmixpdf | discpdf)

The definition of each mixture component consists of a Gaussian pdf optionally preceded by the mixture number and its weight

 
    mixture = [ <Mixture> short float ] mixpdf

If the <Mixture> part is missing then mixture 1 is assumed and the weight defaults to 1.0.

The tmixpdf option is used only for fully tied mixture sets. Since the mixpdf parts are all macros in a tied mixture system and since they are identical for every stream and state, it is only necessary to know the mixture weights. The tmixpdf syntax allows these to be specified in the following compact form

 
    tmixpdf    = <TMix> macro weightList
    weightList = repShort { repShort }
    repShort   = short [ * char ]

where each short is a mixture component weight scaled so that a weight of 1.0 is represented by the integer 32767. The optional asterisk followed by a char is used to indicate a repeat count. For example, 0*5 is equivalent to 5 zeroes. The Gaussians which make up the pool of tied mixtures are defined using ~m macros called macro1, macro2, macro3, etc.
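The compact weight list can be expanded as in the following Python sketch. The function name `expand_tmix_weights` is hypothetical, and the sketch assumes that each repShort arrives as a single token with no white space around the asterisk and that the repeat count is written as a decimal integer.

```python
def expand_tmix_weights(tokens):
    """Expand repShort tokens such as ['32767', '0*5'] into float weights."""
    weights = []
    for tok in tokens:
        # 'value*count' repeats value count times; a bare value occurs once.
        value, _, count = tok.partition("*")
        repeat = int(count) if count else 1
        # A stored short of 32767 represents a weight of 1.0.
        weights.extend([int(value) / 32767.0] * repeat)
    return weights
```

For instance, the tokens 32767 and 0*5 expand to one weight of 1.0 followed by five weights of 0.0.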

Discrete probability HMMs are defined in a similar way

 
    discpdf = <DProb> weightList

The only difference is that the weights in the weightList are scaled log probabilities as defined in section 7.6.

The definition of a Gaussian pdf requires the mean vector to be given and one of the possible forms of covariance

 
    mixpdf  = ~m macro | mean cov [ <GConst> float ]
    mean    = ~u macro | <Mean> short vector
    cov     = var | inv | xform
    var     = ~v macro | <Variance> short vector
    inv     = ~i macro |
              (<InvCovar> | <LLTCovar>) short tmatrix
    xform   = ~x macro | <Xform> short short matrix
    matrix  = float { float }
    tmatrix = matrix

In mean and var, the short preceding the vector defines the length of the vector; in inv, the short preceding the tmatrix gives the size of this square upper triangular matrix; and in xform, the two shorts preceding the matrix give the number of rows and columns. The optional <GConst> gives that part of the log probability of a Gaussian that can be precomputed. If it is omitted, it will be computed during load-in; including it simply saves some time. HTK tools which output HMM definitions always include this field.
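This section does not state the formula for the precomputed term. As a hedged illustration only, the Python sketch below assumes the common convention in which the term is log((2π)^n |Σ|) for an n-dimensional Gaussian, which for a diagonal covariance reduces to n log(2π) plus the sum of the log variances. The function name `gconst_diag` is hypothetical.

```python
import math

def gconst_diag(variances):
    """Precomputable log-probability term for a diagonal-covariance Gaussian.

    ASSUMPTION: the term is log((2*pi)^n * |Sigma|); this convention is
    not taken from the grammar above.
    """
    n = len(variances)
    # For a diagonal covariance, log|Sigma| is the sum of the log variances.
    return n * math.log(2.0 * math.pi) + sum(math.log(v) for v in variances)
```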

In addition to defining the output distributions, a state can have a duration probability distribution defined for it.

 
    duration = ~d macro | <Duration> short vector

Alternatively, as shown by the top level syntax for a hmmdef, duration parameters can be specified for a whole model.

Finally, the transition matrix is defined by

 
    transP = ~t macro | <TransP> short matrix

where the short in this case should be equal to the number of states in the model.
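The size constraint on the transition matrix can be checked as in the following Python sketch, which also arranges the flat list of floats into rows. The function name `read_transp` and its arguments are hypothetical, and the values are assumed to be listed row by row.

```python
def read_transp(num_states, size, values):
    """Return the transition matrix as a list of rows.

    num_states -- the argument of <NumStates>
    size       -- the short following <TransP>
    values     -- the floats of the matrix, assumed to be in row order
    """
    if size != num_states:
        raise ValueError("<TransP> size must equal the number of states")
    if len(values) != size * size:
        raise ValueError("expected %d matrix elements" % (size * size))
    return [list(values[r * size:(r + 1) * size]) for r in range(size)]
```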

