Next: 13.12.3 Use Up: 13.12 HQuant Previous: 13.12.1 Function

13.12.2 VQ Codebook Format

Externally, a VQ table is stored in a text file consisting of a header followed by a sequence of entries, one for each tree node. One tree is built for each stream, and a linear codebook is represented by a tree in which there are only right branches.

The header consists of a magic number followed by the codebook type, the covariance kind, the number of following nodes, the number of streams and the width of each stream.

 
    header = magic type covkind numNodes numS swidth1 swidth2 ...

where magic is a magic number, usually the code for the parameter kind of the data. The type field specifies the kind of codebook:

 
    type = linear (0), binary tree-structured (1)

The covariance kind determines the type of distance metric to be used:

 
    covkind = diagonal covariance (1), full covariance (2), euclidean (5)

Within the file, these covariances are stored in inverse form.
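
As an illustration, the header layout described above could be read as follows. This is a minimal Python sketch and not HTK code: it assumes that the header fields are whitespace-separated integers, and the names VQHeader and parse_vq_header are invented for this example.

```python
from dataclasses import dataclass
from typing import List

# Codes as listed in the text above.
CODEBOOK_TYPES = {0: "linear", 1: "binary tree-structured"}
COV_KINDS = {1: "diagonal covariance", 2: "full covariance", 5: "euclidean"}


@dataclass
class VQHeader:
    """Hypothetical container for the header fields."""
    magic: int
    cb_type: int
    covkind: int
    num_nodes: int
    stream_widths: List[int]


def parse_vq_header(tokens: List[str]) -> VQHeader:
    """Parse 'magic type covkind numNodes numS swidth1 swidth2 ...'.

    Assumes the tokens were obtained by splitting the header on whitespace.
    """
    if len(tokens) < 5:
        raise ValueError("header needs magic, type, covkind, numNodes and numS")
    magic, cb_type, covkind, num_nodes, num_streams = (int(t) for t in tokens[:5])
    if cb_type not in CODEBOOK_TYPES:
        raise ValueError(f"unknown codebook type {cb_type}")
    if covkind not in COV_KINDS:
        raise ValueError(f"unknown covariance kind {covkind}")
    # One width per stream follows numS.
    widths = [int(t) for t in tokens[5:5 + num_streams]]
    if len(widths) != num_streams:
        raise ValueError("fewer stream widths than numS")
    return VQHeader(magic, cb_type, covkind, num_nodes, widths)
```

For example, with made-up values, parse_vq_header("9 1 1 3 2 12 4".split()) would describe a binary tree-structured codebook with diagonal covariances, 3 nodes and 2 streams of widths 12 and 4.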

Each node entry has the following form:

 
    node-entry = stream vqidx nodeId leftId rightId
                 mean-vector
                 [inverse-covariance-matrix | inverse-variance-vector]

Stream is the stream index for this entry. Vqidx is the VQ index corresponding to this entry; this is the number that appears in vector quantised speech files. In tree-structured codebooks, it is zero for non-terminal nodes. Every node has a unique integer identifier (distinct from the VQ index) given by nodeId. The left and right daughters of the current node are given by leftId and rightId. In a linear codebook, the left identifier is always zero.
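
The node-entry layout and the right-branch representation of a linear codebook can be sketched in Python as shown below. This is an illustration only, under several assumptions that are not stated in this section: fields are whitespace-separated, an identifier of zero means "no daughter", euclidean codebooks carry no covariance data, and the distance is a plain (inverse-variance weighted) squared distance, which may differ from what HTK computes. The storage layout of a full inverse covariance matrix is not given here, so the sketch does not attempt to read it. All names (VQNode, parse_node, linear_entries, distance, quantise) are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

DIAGONAL, FULL, EUCLIDEAN = 1, 2, 5  # covkind codes from the header


@dataclass
class VQNode:
    stream: int
    vqidx: int
    node_id: int
    left_id: int
    right_id: int
    mean: List[float]
    inv_var: Optional[List[float]] = None  # present only for diagonal covariance


def parse_node(tokens: List[str], pos: int, covkind: int,
               widths: Dict[int, int]) -> Tuple[VQNode, int]:
    """Read one node entry at tokens[pos]; return the node and the next position.

    widths maps a stream index to its width (built by the caller from the header).
    """
    stream, vqidx, node_id, left_id, right_id = (int(t) for t in tokens[pos:pos + 5])
    pos += 5
    width = widths[stream]
    mean = [float(t) for t in tokens[pos:pos + width]]
    pos += width
    inv_var = None
    if covkind == DIAGONAL:
        inv_var = [float(t) for t in tokens[pos:pos + width]]
        pos += width
    elif covkind == FULL:
        # The matrix layout is not specified in this section, so it is not guessed.
        raise NotImplementedError("full covariance is not covered by this sketch")
    elif covkind != EUCLIDEAN:
        raise ValueError(f"unknown covariance kind {covkind}")
    if len(mean) != width or (inv_var is not None and len(inv_var) != width):
        raise ValueError("truncated node entry")
    return VQNode(stream, vqidx, node_id, left_id, right_id, mean, inv_var), pos


def linear_entries(nodes: Dict[int, VQNode], root_id: int) -> List[VQNode]:
    """Follow rightId links from root_id (assumes id 0 terminates the chain)."""
    chain, seen, node_id = [], set(), root_id
    while node_id != 0:
        if node_id in seen:
            raise ValueError("cycle in rightId links")
        seen.add(node_id)
        chain.append(nodes[node_id])
        node_id = nodes[node_id].right_id
    return chain


def distance(x: List[float], node: VQNode) -> float:
    """Squared distance; inverse variances are multiplied in, not divided by."""
    if node.inv_var is None:
        return sum((a - m) ** 2 for a, m in zip(x, node.mean))
    return sum(w * (a - m) ** 2 for a, m, w in zip(x, node.mean, node.inv_var))


def quantise(x: List[float], nodes: Dict[int, VQNode], root_id: int) -> int:
    """Return the vqidx of the closest entry in a linear codebook."""
    return min(linear_entries(nodes, root_id), key=lambda n: distance(x, n)).vqidx
```

Because the variances are stored in inverse form, the distance needs only multiplications. A tree-structured codebook would instead descend from the root, choosing between leftId and rightId at each non-terminal node; that case is not sketched here.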

Some examples of VQ tables are given in Chapter 10.



ECRL HTK_V2.1: email [email protected]