
2.4.1 What's New in Version 2.1?

This section lists the new features and refinements in HTK Version 2.1 compared to the preceding Version 2.0.

  1. The speech input handling has been partially redesigned and a new energy-based speech/silence detector has been incorporated into HPARM. The detector is robust yet flexible and can be configured through a number of configuration variables. Speech/silence detection can now also be performed on waveform files. The speech/silence detector parameters are now calibrated by asking the user to speak an arbitrary sentence.
  2. HPARM now allows a random noise signal to be added to waveform data via the configuration parameter ADDDITHER. This prevents numerical overflows which can occur with artificially created waveform data under some coding schemes.
  3. HNET has been optimised for more efficient operation when performing forced alignments of utterances using HVITE. Further network optimisations tailored to biphone/triphone-based phone recognition have also been incorporated.
  4. HVITE can now produce a partial recognition hypothesis even when no tokens survive to the end of the network. This is accomplished by setting the HREC configuration parameter FORCEOUT to true.
  5. Dictionary support has been extended to allow pronunciation probabilities to be associated with different pronunciations of the same word. At the same time, HVITE now allows the use of a pronunciation scale factor during recognition.
  6. HTK now provides consistent support for reading and writing HTK binary files (waveforms, binary MMFs, binary SLFs, HEREST accumulators) across different machine architectures, incorporating automatic byte swapping. By default, all binary data files handled by the tools are now read and written in big-endian (NONVAX) byte order. The default behaviour can be changed via the configuration parameters NATURALREADORDER and NATURALWRITEORDER.
  7. HWAVE supports the reading of waveforms in Microsoft WAVE file format.
  8. HAUDIO allows key-press control of live audio input.
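
The idea behind the energy-based speech/silence detection mentioned in item 1 can be illustrated with a minimal sketch. This is not HPARM's algorithm: the function names, the non-overlapping framing, the fixed decibel margin and the crude calibration (taking the quietest frame as the silence level) are all assumptions made for illustration only.

```python
import math

def frame_energies_db(samples, frame_len):
    """Split samples into non-overlapping frames; return each frame's energy in dB."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_sq = sum(s * s for s in frame) / frame_len
        # The small floor avoids taking the log of zero for an all-zero frame.
        energies.append(10.0 * math.log10(mean_sq + 1e-10))
    return energies

def detect_speech(energies_db, silence_db, margin_db):
    """Mark a frame as speech when it exceeds the silence level by margin_db."""
    threshold = silence_db + margin_db
    return [e > threshold for e in energies_db]

# Synthetic signal (hypothetical data): 2 quiet frames, 2 loud frames, 1 quiet frame.
quiet = [1.0, -1.0] * 40       # 80 samples, mean square 1
loud = [100.0, -100.0] * 40    # 80 samples, mean square 10000
signal = quiet * 2 + loud * 2 + quiet

energies = frame_energies_db(signal, 80)
silence_level = min(energies)  # crude calibration: quietest frame is silence
flags = detect_speech(energies, silence_level, 10.0)
print(flags)
```

A real detector would typically smooth the frame decisions over time so that a single loud or quiet frame does not toggle the speech state.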

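The dithering described in item 2 can be sketched in the same spirit. The example assumes, for illustration, that the problem case is an artificially created all-zero frame whose log energy is undefined; the uniform noise distribution and the helper names are likewise assumptions and do not describe how HPARM implements ADDDITHER.

```python
import math
import random

def add_dither(samples, amplitude, rng):
    """Return samples with uniform noise in [-amplitude, amplitude] added."""
    return [s + rng.uniform(-amplitude, amplitude) for s in samples]

def log_energy(frame):
    """Log of the sum of squares; math.log raises ValueError for an all-zero frame."""
    return math.log(sum(s * s for s in frame))

rng = random.Random(0)          # fixed seed so the sketch is repeatable
silent = [0.0] * 160            # artificially created, perfectly silent frame
dithered = add_dither(silent, 1.0, rng)

try:
    log_energy(silent)
except ValueError:
    print("log energy of the undithered frame is undefined")

print(math.isfinite(log_energy(dithered)))
```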

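The byte-order handling in item 6 can be illustrated by packing and unpacking 16-bit samples with an explicit byte order, so that the result is the same on any host. This is a sketch using Python's standard struct module, not HTK code; the function names are hypothetical.

```python
import struct

def write_samples_big_endian(samples):
    """Pack 16-bit signed samples in big-endian order, whatever the host order."""
    return struct.pack(">%dh" % len(samples), *samples)

def read_samples(data, big_endian=True):
    """Unpack 16-bit signed samples using the stated byte order."""
    count = len(data) // 2
    fmt = (">" if big_endian else "<") + "%dh" % count
    return list(struct.unpack(fmt, data[:count * 2]))

samples = [1, -2, 300, -32768, 32767]
blob = write_samples_big_endian(samples)

restored = read_samples(blob)                    # correct byte order
misread = read_samples(blob, big_endian=False)   # wrong byte order

print(restored)
print(misread[0])
```

Reading the data with the wrong byte order turns the first sample, 1, into 256, which is the kind of corruption that automatic byte swapping is meant to prevent.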

ECRL HTK_V2.1: email [email protected]