Next: 3 A Tutorial Example of Using HTK
Up: 2.4 Whats New in Version 2.0?
Previous: 2.4 Whats New in Version 2.0?
This section lists the new features and
refinements in HTK Version 2.1 compared to the preceding Version 2.0.
- The speech input handling has been partially re-designed and a new
energy-based speech/silence detector has been incorporated into HPARM.
The detector is robust yet flexible and can be configured through a number of
configuration variables. Speech/silence detection can now be performed on
waveform files. The calibration of speech/silence detector parameters is now
accomplished by asking the user to speak an arbitrary sentence.
- HPARM now allows a random noise signal to be added to waveform
data via the configuration parameter ADDDITHER. This prevents
numerical overflows which can occur with artificially created waveform data
under some coding schemes.
- HNET has been optimised for more efficient operation when
performing forced alignments of utterances using HVITE. Further
network optimisations tailored to biphone/triphone-based phone recognition
have also been incorporated.
- HVITE can now produce a partial recognition hypothesis even when
no tokens survive to the end of the network. This is accomplished by setting
the HREC configuration parameter FORCEOUT to true.
- Dictionary support has been extended to allow pronunciation probabilities
to be associated with different pronunciations of the same word. At the same
time, HVITE now allows the use of a pronunciation scale factor during
recognition.
- HTK now provides consistent support for reading and writing of HTK
binary files (waveforms, binary MMFs, binary SLFs, HEREST accumulators)
across different machine architectures, incorporating automatic byte swapping.
By default, all binary data files handled by the tools are now written/read in
big-endian (NONVAX) byte order. The default behaviour can be changed
via the configuration parameters NATURALREADORDER and
NATURALWRITEORDER.
- HWAVE supports the reading of waveforms in Microsoft WAVE file
format.
- HAUDIO allows key-press control of live audio input.
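
The energy-based speech/silence detection described in the first item can be illustrated with a minimal sketch. This is not HPARM's algorithm: the frame length, the fraction of frames used to estimate the silence level, the 10 dB margin and all function names are assumptions chosen for illustration. The calibration step estimates a silence level from an arbitrary recording, and any frame whose log energy exceeds that level by the margin is marked as speech.

```python
import math

def frame_energies_db(samples, frame_len=160):
    """Return the log energy (in dB) of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-10))  # small floor avoids log(0)
    return energies

def calibrate_silence_db(energies_db, fraction=0.2):
    """Estimate the silence level as the mean of the quietest frames (assumed heuristic)."""
    count = max(1, int(len(energies_db) * fraction))
    quietest = sorted(energies_db)[:count]
    return sum(quietest) / len(quietest)

def detect_speech(energies_db, silence_db, margin_db=10.0):
    """Mark a frame as speech when it lies margin_db above the silence level."""
    return [e > silence_db + margin_db for e in energies_db]

# Synthetic calibration signal: quiet noise, a loud burst, then quiet noise again.
quiet = [((i * 7) % 5) - 2 for i in range(480)]            # amplitudes in -2..2
loud = [1000 * (1 if i % 2 else -1) for i in range(320)]   # amplitudes of +/-1000
signal = quiet + loud + quiet

energies = frame_energies_db(signal)
silence = calibrate_silence_db(energies)
labels = detect_speech(energies, silence)
```

Estimating the threshold from the recording itself, rather than fixing it in advance, is what lets a detector of this kind adapt to different microphones and background noise levels.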
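
To see why adding a small random signal can help with artificially created waveform data, consider a perfectly silent frame whose energy is exactly zero. The sketch below shows one plausible failure mode, taking the logarithm of zero energy, and how dither avoids it. The uniform noise model, the `level` argument and the function names are assumptions. The text above does not say how ADDDITHER generates its noise or which coding schemes are affected.

```python
import math
import random

def log_energy(frame):
    """Log of the summed squared samples; fails for an all-zero frame."""
    return math.log(sum(s * s for s in frame))

def add_dither(frame, level, rng):
    """Add uniform noise in [-level, level] to each sample (assumed noise model)."""
    return [s + rng.uniform(-level, level) for s in frame]

silent_frame = [0.0] * 160           # artificially created, perfectly silent data

try:
    log_energy(silent_frame)
    failed_without_dither = False
except ValueError:                   # math.log(0.0) raises a math domain error
    failed_without_dither = True

rng = random.Random(0)               # fixed seed so the run is repeatable
dithered = add_dither(silent_frame, level=1.0, rng=rng)
energy_with_dither = log_energy(dithered)
```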
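
The effect of pronunciation probabilities combined with a scale factor can be sketched as follows. The dictionary layout, the example word and its phone sequences, the acoustic scores and the rule of adding the scaled log probability to the acoustic log likelihood are all hypothetical. They illustrate the general idea and are not HVITE's scoring code.

```python
import math

# Hypothetical dictionary: each word maps to (probability, phone sequence) pairs.
dictionary = {
    "either": [(0.6, ["iy", "dh", "ax"]), (0.4, ["ay", "dh", "ax"])],
}

def scaled_pron_score(prob, scale):
    """Scaled log pronunciation probability (assumed combination rule)."""
    return scale * math.log(prob)

def best_pronunciation(word, acoustic_scores, scale):
    """Pick the variant with the highest acoustic score plus scaled pronunciation term."""
    best = None
    for (prob, phones), acoustic in zip(dictionary[word], acoustic_scores):
        total = acoustic + scaled_pron_score(prob, scale)
        if best is None or total > best[0]:
            best = (total, phones)
    return best[1]

# Hypothetical acoustic log likelihoods, one per variant, slightly favouring the second.
acoustic = [-100.0, -99.8]

ignoring_probs = best_pronunciation("either", acoustic, scale=0.0)
using_probs = best_pronunciation("either", acoustic, scale=1.0)
```

With a scale factor of zero the pronunciation probabilities are ignored and the acoustically better variant is chosen. With a scale factor of one the more probable pronunciation wins, even though its acoustic score is slightly lower.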
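
The byte-order handling described above can be illustrated with Python's standard `struct` module. This is a sketch of the concept only, not HTK code: the function names and the `natural_order` flag are hypothetical stand-ins for the behaviour selected by NATURALREADORDER, and 16-bit samples are assumed. Data is always written big-endian, and a reader that asks for big-endian order recovers the same values on any host.

```python
import struct
import sys

def write_samples_big_endian(samples):
    """Pack 16-bit samples in big-endian order regardless of the host."""
    return struct.pack(">%dh" % len(samples), *samples)

def read_samples(data, natural_order=False):
    """Unpack 16-bit samples; natural_order=True uses the host's own byte order."""
    order = "=" if natural_order else ">"
    return list(struct.unpack("%s%dh" % (order, len(data) // 2), data))

samples = [1, -2, 256, 32767]
data = write_samples_big_endian(samples)

portable = read_samples(data)                      # correct on every architecture
natural = read_samples(data, natural_order=True)   # correct only on big-endian hosts
host_is_big_endian = sys.byteorder == "big"
```

On a little-endian machine the `natural` list contains byte-swapped values, which is why a fixed on-disk order with automatic swapping makes binary files portable between architectures.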
ECRL HTK_V2.1: email [email protected]