13.18.2 Use

HVITE is invoked via the command line:

   HVite [options] dictFile hmmList testFiles ...

HVITE will then either load a single network file and match it against each of the test files (-w netFile), or create a new network for each test file, either from the corresponding label file (-a) or from a word lattice (-w). When a new network is created for each test file, the path name of the label (or lattice) file to load is determined from the test file name and the -L and -X options described below.

If no testFiles are specified, the -w s option must be specified and recognition will be performed from direct audio.

The hmmList should contain a list of the models required to construct the network from the word level representation.

The recogniser output is written in the form of a label file whose path name is determined from the test file name and the -l and -y options described below. The list of test files can be stored in a script file if required.
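The derivation of label file names from test file names can be sketched as follows. This is an illustration only and not HVITE's own code: the helper name derive_path and all file names are hypothetical, and the sketch assumes that the directory option replaces the directory of the test file and that the extension option replaces its extension.

```python
import posixpath


def derive_path(test_file, directory=None, ext="rec"):
    """Hypothetical helper: map a test file name to a label file name.

    Assumes the base name of the test file is kept, its extension is
    replaced by `ext`, and `directory` (if given) replaces its directory.
    """
    head, tail = posixpath.split(test_file)
    base, _old_ext = posixpath.splitext(tail)
    target_dir = head if directory is None else directory
    return posixpath.join(target_dir, base + "." + ext)


# Output label file with the defaults: same directory as the data, extension rec.
print(derive_path("data/test1.mfc"))
# Output label file as it might be named with -l out -y txt (hypothetical values).
print(derive_path("data/test1.mfc", directory="out", ext="txt"))
# Input label file as it might be located with -L labs and the default -X lab.
print(derive_path("data/test1.mfc", directory="labs", ext="lab"))
```

The same function serves both directions because the input options (-L, -X) and the output options (-l, -y) play parallel roles.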

When performing N-best recognition (see the -n option described below), the output label file can contain multiple alternatives (-n i N) and a lattice file containing multiple hypotheses can be produced (see the -z option).

The detailed operation of HVITE is controlled by the following command line options:

-a

Perform alignment. HVITE will load a label file and create an alignment network for each test file.

-b s

Use s as the sentence boundary during alignment.

-c f

Set the tied-mixture observation pruning threshold to f. When all mixtures of all models are tied to create a full tied-mixture system, the calculation of output probabilities is treated as a special case. Only those mixture component probabilities which fall within f of the maximum mixture component probability are used in calculating the state output probabilities (default 10.0).
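The effect of the threshold can be sketched as follows. This is a minimal illustration, not HVITE's implementation: the function name is hypothetical, and the sketch assumes that the component probabilities are held as log probabilities and that a component lying exactly f below the maximum is retained.

```python
def prune_mixtures(log_probs, f=10.0):
    """Return the indices of the components within f of the best component.

    Assumption: `log_probs` holds one log probability per tied-mixture
    component, so "within f" is a difference of log values.
    """
    best = max(log_probs)
    return [i for i, lp in enumerate(log_probs) if best - lp <= f]


component_log_probs = [-42.0, -45.5, -60.0, -51.9]
# With the default threshold, the third component (18.0 below the best) is dropped.
print(prune_mixtures(component_log_probs))
# A tight threshold keeps only the best component.
print(prune_mixtures(component_log_probs, f=1.0))
```

Only the retained components would then contribute to the state output probabilities.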

-d dir

This specifies the directory to search for the HMM definition files corresponding to the labels used in the recognition network.

-e

When using direct audio input, output transcriptions are not normally saved. When this option is set, each output transcription is written to a file called PnS, where n is an integer which increments with each output file and P and S are strings which are empty by default but can be set using the configuration variables RECOUTPREFIX and RECOUTSUFFIX.

-f

During recognition keep track of full state alignment.

-g

When using direct audio input, this option enables audio replay of each input utterance after it has been recognised.

-i s

Output transcriptions to MLF s.

-l dir

This specifies the directory in which to store the output label files. If this option is not used then HVITE will store the label files in the same directory as the data. When output is directed to an MLF, this option can be used to add a path to each output file name. In particular, setting the option -l '*' will cause a label file named xxx to appear as the pattern "*/xxx" in the output MLF. This is useful for generating MLFs which are independent of the location of the corresponding data files.
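The way a path given with -l is combined with a label file name in an MLF can be sketched as follows. The helper name mlf_pattern is hypothetical, and the sketch covers only the case where a path is supplied.

```python
def mlf_pattern(label_name, path):
    """Hypothetical helper: the quoted pattern written for one label file.

    Assumes the path given with -l is joined to the label file name
    with a single "/" and the result is enclosed in double quotes.
    """
    return '"' + path + "/" + label_name + '"'


# With -l '*', the entry no longer depends on where the data files live.
print(mlf_pattern("xxx", "*"))  # "*/xxx"
```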

-m

During recognition keep track of model boundaries.

-n i [N]

Use i tokens in each state to perform N-best recognition. The number of alternative output hypotheses N defaults to 1.

-o s

Choose how the output labels should be formatted. s is a string with certain letters (from NSCTWM) indicating binary flags that control formatting options.

   N  Normalise acoustic scores by dividing by the duration (in frames) of the segment.
   S  Remove scores from output label. By default scores will be set to the total likelihood of the segment.
   C  Set the transcription labels to start and end on frame centres. By default start times are set to the start time of the frame and end times are set to the end time of the frame.
   T  Do not include times in output label files.
   W  Do not include words in output label files when performing state or model alignment.
   M  Do not include model names in output label files when performing state and model alignment.
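The effect of the N, S and T flags on a single output label can be sketched as follows. This is an illustration under stated assumptions, not HVITE's output routine: the function name, the field order and the number formatting are all hypothetical, and the C, W and M flags are not modelled.

```python
def format_label(start, end, name, score, n_frames, flags=""):
    """Hypothetical formatter for one output label line.

    Only the N, S and T flags are sketched; field order and number
    formatting are assumptions made for illustration.
    """
    fields = []
    if "T" not in flags:      # T: do not include times
        fields += [str(start), str(end)]
    fields.append(name)
    if "S" not in flags:      # S: remove scores
        if "N" in flags:      # N: divide the score by the duration in frames
            score = score / n_frames
        fields.append("%.3f" % score)
    return " ".join(fields)


# Default: times, name and the total likelihood of the segment.
print(format_label(0, 2500000, "hello", -1500.0, 25))
# N: the score is normalised by the 25 frames of the segment.
print(format_label(0, 2500000, "hello", -1500.0, 25, flags="N"))
# S and T together leave only the name.
print(format_label(0, 2500000, "hello", -1500.0, 25, flags="ST"))
```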

-p f

Set the word insertion log probability to f (default 0.0).

-q s

Choose how the output lattice should be formatted. s is a string with certain letters (from ABtvaldmn) indicating binary flags that control formatting options.

   A  Attach word labels to arcs rather than nodes.
   B  Output lattices in binary for speed.
   t  Output node times.
   v  Output pronunciation information.
   a  Output acoustic likelihoods.
   l  Output language model likelihoods.
   d  Output word alignments (if available).
   m  Output within word alignment durations.
   n  Output within word alignment likelihoods.

-s f

Set the grammar scale factor to f. This factor post-multiplies the language model likelihoods from the word lattices (default value 1.0).
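The combined effect of the grammar scale factor and the word insertion log probability (-p) on a language model score can be sketched as follows. The function name is hypothetical, and the sketch assumes that the language model likelihood is a log probability and that the insertion log probability is added after scaling.

```python
def word_transition_score(lm_log_prob, s=1.0, p=0.0):
    """Hypothetical helper: scale a language model log probability by s
    and add the word insertion log probability p.

    Assumption: the two options combine as s * x + p for each word.
    """
    return s * lm_log_prob + p


# With the defaults (-s 1.0, -p 0.0) the score is unchanged.
print(word_transition_score(-2.0))
# A larger scale factor and a negative insertion log probability both
# make each additional word more costly.
print(word_transition_score(-2.0, s=5.0, p=-10.0))
```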

-t f

Enable beam searching such that any model whose maximum log probability token falls more than f below the maximum for all models is deactivated. Setting f to 0.0 disables the beam search mechanism (default value 0.0).
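The beam criterion can be sketched as follows. This is a minimal illustration rather than HVITE's search code: the function name and the model names are hypothetical, and each model is represented only by the log probability of its best token.

```python
def active_after_beam(model_best, f=0.0):
    """Return the set of models that stay active under a beam of width f.

    `model_best` maps a model name to the log probability of its best
    token. A model is deactivated when that value falls more than f
    below the best value over all models; f == 0.0 disables pruning.
    """
    if f == 0.0:
        return set(model_best)
    top = max(model_best.values())
    return {m for m, lp in model_best.items() if top - lp <= f}


scores = {"a": -100.0, "b": -150.0, "c": -320.0}
# Model c lies 220.0 below the best model and falls outside a beam of 200.0.
print(sorted(active_after_beam(scores, f=200.0)))
# With the default of 0.0 no model is deactivated.
print(sorted(active_after_beam(scores)))
```

A narrower beam reduces computation at the risk of pruning the path that would eventually have scored best.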

-u i

Set the maximum number of active models to i. Setting i to 0 disables this limit (default 0).

-v f

Enable word end pruning. Do not propagate tokens from word end nodes that fall more than f below the maximum word end likelihood (default 0.0).

-w [s]

Perform recognition from word level networks. If s is included then use it to define the network used for every file.

-x ext

This sets the extension to use for HMM definition files to ext.

-y ext

This sets the extension for output label files to ext (default rec).

-z ext

Enable output of lattices (if performing N-best recognition) with extension ext (default off).

-L dir

This specifies the directory in which to find input label files (when -a is specified) or network files (when -w is specified).

-X s

Set the extension for the input label or network files to be s (default value lab).

-F fmt

Set the source data format to fmt.

-G fmt

Set the label file format to fmt.

-H mmf

Load HMM macro model file mmf. This option may be repeated to load multiple MMFs.

-I mlf

This loads the master label file mlf. This option may be repeated to load several MLFs.

-P fmt

Set the target label format to fmt.

HVITE also supports the standard options -A, -C, -D, -S, -T, and -V as described in section 4.4.
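As an illustration of how the options combine with the positional arguments, the following sketch assembles an argument list for an alignment run. It only builds and prints the command and does not execute HVITE. The function name and every file name are hypothetical; the flags used (-a, -m, -H, -I, -i) are those described above.

```python
import shlex


def hvite_alignment_command(mmf, mlf_in, mlf_out, dict_file, hmm_list, test_files):
    """Hypothetical helper: build an HVite argument list for alignment.

    Options come first, followed by dictFile, hmmList and the test
    files, in the order given by the usage line.
    """
    argv = [
        "HVite",
        "-a",            # perform alignment from label files
        "-m",            # keep track of model boundaries
        "-H", mmf,       # HMM macro model file to load
        "-I", mlf_in,    # master label file holding the input labels
        "-i", mlf_out,   # MLF to receive the output transcriptions
        dict_file,
        hmm_list,
    ]
    argv += list(test_files)
    return argv


cmd = hvite_alignment_command(
    "hmmdefs", "words.mlf", "aligned.mlf", "dict", "hmmlist", ["a.mfc", "b.mfc"]
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list keeps each argument separate, so file names containing spaces are not split when the list is later passed to a process launcher.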

ECRL HTK_V2.1