Much of the functionality of HTK is built into the library modules. These modules ensure that every tool interfaces to the outside world in exactly the same way. They also provide a central resource of commonly used functions. Fig. 2.1 illustrates the software structure of a typical HTK tool and shows its input/output interfaces.
User input/output and interaction with the operating system are controlled by the library module HSHELL, and all memory management is controlled by HMEM. Math support is provided by HMATH, and the signal processing operations needed for speech analysis are in HSIGP. Each of the file types required by HTK has a dedicated interface module: HLABEL provides the interface for label files, HLM for language model files, HNET for networks and lattices, HDICT for dictionaries, HVQ for VQ codebooks and HMODEL for HMM definitions.
All speech input and output at the waveform level is via HWAVE and at the parameterised level via HPARM. As well as providing a consistent interface, HWAVE and HLABEL support multiple file formats, allowing data to be imported from other systems. Direct audio input is supported by HAUDIO and simple interactive graphics are provided by HGRAF. HUTIL provides a number of utility routines for manipulating HMMs and HTRAIN contains support for the various HTK training tools. Finally, HREC contains the main recognition processing functions.
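As a quick reference, the division of responsibilities described above can be collected into a lookup table. The following Python sketch is purely illustrative: the module names and their roles are taken from the text, but the dictionary HTK_LIBRARY_MODULES and the helper modules_for are hypothetical names introduced here and are not part of HTK.

```python
# Illustrative summary only: names and roles come from the text above.
# HTK_LIBRARY_MODULES and modules_for are hypothetical, not part of HTK.
HTK_LIBRARY_MODULES = {
    "HSHELL": "user input/output and interaction with the operating system",
    "HMEM": "memory management",
    "HMATH": "math support",
    "HSIGP": "signal processing operations for speech analysis",
    "HLABEL": "label files",
    "HLM": "language model files",
    "HNET": "networks and lattices",
    "HDICT": "dictionaries",
    "HVQ": "VQ codebooks",
    "HMODEL": "HMM definitions",
    "HWAVE": "speech input and output at the waveform level",
    "HPARM": "speech input and output at the parameterised level",
    "HAUDIO": "direct audio input",
    "HGRAF": "simple interactive graphics",
    "HUTIL": "utility routines for manipulating HMMs",
    "HTRAIN": "support for the training tools",
    "HREC": "main recognition processing functions",
}


def modules_for(keyword):
    """Return the sorted names of modules whose role mentions keyword (case-insensitive)."""
    needle = keyword.lower()
    return sorted(
        name for name, role in HTK_LIBRARY_MODULES.items() if needle in role.lower()
    )
```

For example, modules_for("HMM") picks out the two modules whose described role concerns HMMs, namely HMODEL and HUTIL.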
As noted in the next section, fine control over the behaviour of these library modules is provided by setting configuration variables. Detailed descriptions of the functions provided by the library modules are given in the second part of this book and the relevant configuration variables are described as they arise. For reference purposes, a complete list is given in chapter 14.