
4.7 Memory Management

 

Memory management is a very low-level function and is mostly invisible to HTK users. However, some applications require very large amounts of memory. For example, building the models for a large-vocabulary continuous speech dictation system might require 150 MB or more. When memory demands become this large, it is important to understand how system design decisions affect memory usage. The first step is a basic understanding of memory allocation in HTK.

Many HTK tools dynamically construct large and complex data structures in memory. To keep strict control over this and to reduce memory allocation overheads to an absolute minimum, HTK performs its own memory management. Thus, every time a module or tool needs to allocate memory, it does so by calling routines in HMEM. At a slightly higher level, math objects such as vectors and matrices are allocated by HMATH, which in turn uses the primitives provided by HMEM.

To make memory allocation and de-allocation very fast, tools create specific memory allocators for specific objects or groups of objects. Each memory allocator is divided into a sequence of blocks and is organised as either a Stack, an M-heap or a C-heap. A Stack constrains allocation and de-allocation requests to be made in a last-allocated first-deallocated order but allows objects of any size to be allocated. An M-heap allows an arbitrary pattern of allocation and de-allocation requests but all allocated objects must be the same size. Both of these memory allocation disciplines are more restricted than the general mechanism supplied by the operating system. As a result, such memory operations are faster and incur no storage overhead, because no hidden housekeeping information has to be maintained in each allocated object. Finally, a C-heap uses the underlying operating system and allows arbitrary allocation patterns, and as a result incurs the associated time and space overheads. The use of C-heaps is avoided wherever possible.
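
The two restricted disciplines described above can be illustrated with a small toy model. This is only a sketch in Python and not the HMEM interface: the class names `StackAllocator` and `MHeapAllocator` and their methods are hypothetical, each allocator manages a single fixed block rather than a growing sequence of blocks, and "addresses" are plain byte offsets.

```python
class StackAllocator:
    """Toy Stack discipline: objects of any size, released in reverse order only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.top = 0      # offset of the next free byte
        self.sizes = []   # demo-only record used to check the release order

    def alloc(self, nbytes):
        if self.top + nbytes > self.capacity:
            raise MemoryError("stack block exhausted")
        offset = self.top
        self.top += nbytes
        self.sizes.append(nbytes)
        return offset

    def free(self, offset):
        # Only the most recently allocated object may be released.
        if not self.sizes or offset != self.top - self.sizes[-1]:
            raise ValueError("not the last-allocated object")
        self.top -= self.sizes.pop()


class MHeapAllocator:
    """Toy M-heap discipline: any release order, but one fixed object size."""

    def __init__(self, elem_size, num_elems):
        self.elem_size = elem_size
        # Free slots kept as a list so that slot 0 is handed out first.
        self.free_slots = list(range(num_elems - 1, -1, -1))
        self.used = set()

    def alloc(self):
        if not self.free_slots:
            raise MemoryError("M-heap block exhausted")
        slot = self.free_slots.pop()
        self.used.add(slot)
        return slot * self.elem_size

    def free(self, offset):
        slot, rem = divmod(offset, self.elem_size)
        if rem != 0 or slot not in self.used:
            raise ValueError("offset is not a live element")
        self.used.remove(slot)
        self.free_slots.append(slot)  # slot is reused by the next alloc
```

In this model, allocating 10 and then 20 bytes from a `StackAllocator` and trying to free the first object before the second raises an error, whereas an `MHeapAllocator` accepts releases in any order and simply reuses the freed slot. Neither allocator stores a header inside the allocated objects, which is the source of the space saving mentioned above.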

Most tools provide one or more trace options which show how much memory has been allocated. The following shows the form of the output:

  ---------------------- Heap Statistics ------------------------
  nblk=1, siz= 100000*1, used= 32056, alloc= 100000 : Global Stack[S]
  nblk=1, siz=   200*28, used=   100, alloc=   5600 : cellHeap[M]
  nblk=1, siz=  10000*1, used=  3450, alloc=  10000 : mlfHeap[S]
  nblk=2, siz=   7504*1, used=  9216, alloc=  10346 : nameHeap[S]
  ---------------------------------------------------------------
Each line describes the status of one memory allocator and gives the number of blocks allocated, the current block size (number of elements in the block × the number of bytes in each element), the total number of bytes in use by the tool and the total number of bytes currently allocated to that allocator. The end of each line gives the name of the allocator and its type: Stack[S], M-heap[M] or C-heap[C]. The element size for Stacks will always be 1, whereas for M-heaps it depends on the size of the objects that the heap holds. The documentation for the memory-intensive HTK tools indicates what each of the main memory allocators is used for, and this information allows the effects of various system design choices to be monitored.
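
When comparing design choices it can be convenient to read these statistics programmatically. The following is a minimal sketch in Python, not part of HTK: the helper names `parse_heap_stats` and `total_alloc` are hypothetical, the regular expression is inferred only from the sample output shown above, and the type letter `C` for C-heaps is an assumption.

```python
import re

# Pattern inferred from the sample output above; real tool output may differ.
HEAP_LINE = re.compile(
    r"nblk=\s*(\d+),\s*siz=\s*(\d+)\*\s*(\d+),\s*used=\s*(\d+),"
    r"\s*alloc=\s*(\d+)\s*:\s*(.+?)\[([SMC])\]\s*$"
)

SAMPLE = """\
  ---------------------- Heap Statistics ------------------------
  nblk=1, siz= 100000*1, used= 32056, alloc= 100000 : Global Stack[S]
  nblk=1, siz=   200*28, used=   100, alloc=   5600 : cellHeap[M]
  nblk=1, siz=  10000*1, used=  3450, alloc=  10000 : mlfHeap[S]
  nblk=2, siz=   7504*1, used=  9216, alloc=  10346 : nameHeap[S]
  ---------------------------------------------------------------
"""


def parse_heap_stats(text):
    """Return one dict per allocator line; separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = HEAP_LINE.search(line)
        if m is None:
            continue
        nblk, nelem, elem_size, used, alloc = (int(g) for g in m.groups()[:5])
        rows.append({
            "name": m.group(6).strip(),
            "type": m.group(7),
            "nblk": nblk,
            "block_bytes": nelem * elem_size,  # current block size in bytes
            "used": used,
            "alloc": alloc,
        })
    return rows


def total_alloc(rows):
    """Total bytes currently allocated across all listed allocators."""
    return sum(row["alloc"] for row in rows)


rows = parse_heap_stats(SAMPLE)
print(total_alloc(rows))  # 125946 for the sample above
```

For the `cellHeap` line, the current block size is 200 × 28 = 5600 bytes, which matches its `alloc` figure because only one block has been allocated. For `nameHeap`, two blocks have been allocated, so `alloc` exceeds the current block size.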



ECRL HTK_V2.1: email [email protected]