
3.1.1 Step 1 - the Task Grammar

The system to be built here provides a voice-operated interface for phone dialling, so the recogniser must handle both digit strings and a list of personal names. Typical inputs might be

Dial three three two six five four

Dial nine zero four one oh nine

Phone Woodland

Call Steve Young

HTK provides a grammar definition language for specifying simple task grammars such as this. It consists of a set of variable definitions followed by a regular expression describing the words to recognise. For the voice dialling application, a suitable grammar might be

    $digit = ONE | TWO | THREE | FOUR | FIVE |
             SIX | SEVEN | EIGHT | NINE | OH | ZERO;
    $name  = [ JOOP ] JANSEN |
             [ JULIAN ] ODELL |
             [ DAVE ] OLLASON |
             [ PHIL ] WOODLAND | 
             [ STEVE ] YOUNG;
    ( SENT-START ( DIAL <$digit> | (PHONE|CALL) $name) SENT-END )
where the vertical bars denote alternatives, the square brackets denote optional items and the angle brackets denote one or more repetitions. Names beginning with $ are variables, each standing for the sub-expression assigned to it. The complete grammar can be depicted as a network as shown in Fig. 3.1.
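Because the grammar is a regular expression over words, its meaning can be checked with an ordinary regular-expression engine. The following is a minimal Python sketch, not part of HTK: the names DIGIT, NAME, SENTENCE and accepts are hypothetical, and SENT-START and SENT-END are treated here as utterance boundaries rather than spoken words, which is an assumption of this sketch.

```python
import re

# Alternatives copied from the $digit definition (vertical bars).
DIGIT = r"(?:ONE|TWO|THREE|FOUR|FIVE|SIX|SEVEN|EIGHT|NINE|OH|ZERO)"

# Alternatives copied from the $name definition; "(?:X )?" mirrors [ X ].
NAME = (r"(?:(?:JOOP )?JANSEN|(?:JULIAN )?ODELL|(?:DAVE )?OLLASON|"
        r"(?:PHIL )?WOODLAND|(?:STEVE )?YOUNG)")

# DIAL <$digit> | (PHONE|CALL) $name
# "+" mirrors the angle brackets (one or more repetitions).
# Assumption: SENT-START / SENT-END are modelled by matching the whole string.
SENTENCE = re.compile(rf"DIAL(?: {DIGIT})+|(?:PHONE|CALL) {NAME}")


def accepts(utterance: str) -> bool:
    """Return True if the word sequence is allowed by the grammar."""
    # Upper-case and collapse whitespace so words are separated by one space.
    words = " ".join(utterance.upper().split())
    return SENTENCE.fullmatch(words) is not None


if __name__ == "__main__":
    for text in ["Dial three three two six five four",
                 "Phone Woodland",
                 "Call Steve Young",
                 "Dial",
                 "Call Steve"]:
        print(f"{text!r}: {accepts(text)}")
```

Under these assumptions the four example inputs given earlier are accepted, while "Dial" on its own is rejected because the angle brackets require at least one digit, and "Call Steve" is rejected because only the first name is optional.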

The above high-level representation of a task grammar is provided for user convenience. The HTK recogniser actually requires a word network defined in a low-level notation called HTK Standard Lattice Format (SLF), in which each word instance and each word-to-word transition is listed explicitly. This word network can be created automatically from the grammar above using the HParse tool. Thus, assuming that the file gram contains the above grammar, executing

    HParse gram wdnet
will create an equivalent word network in the file wdnet (see Fig. 3.2).
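To make the idea of listing every word instance and every word-to-word transition concrete, the Python sketch below expands the voice dialling grammar by hand into a list of nodes and a list of arcs. It is an illustration only: it does not read or write SLF, it does not reproduce the network that HParse actually generates (which may be organised differently), and build_network, DIGITS and NAMES are hypothetical names introduced for this example.

```python
# Words taken from the $digit and $name definitions in the grammar.
DIGITS = ["ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX",
          "SEVEN", "EIGHT", "NINE", "OH", "ZERO"]
NAMES = [("JOOP", "JANSEN"), ("JULIAN", "ODELL"), ("DAVE", "OLLASON"),
         ("PHIL", "WOODLAND"), ("STEVE", "YOUNG")]


def build_network():
    """Return (nodes, arcs): nodes[i] is a word, arcs are (from, to) pairs."""
    nodes = []
    arcs = []

    def add(word):
        nodes.append(word)
        return len(nodes) - 1

    start = add("SENT-START")
    end = add("SENT-END")

    # DIAL <$digit>: one or more digits, so every digit can follow DIAL,
    # follow any digit (including itself), and precede SENT-END.
    dial = add("DIAL")
    arcs.append((start, dial))
    digit_nodes = [add(d) for d in DIGITS]
    for d in digit_nodes:
        arcs.append((dial, d))
        arcs.append((d, end))
        for e in digit_nodes:
            arcs.append((d, e))

    # (PHONE|CALL) $name: the first name is optional, so each verb
    # connects both to the first name and directly to the surname.
    verbs = [add("PHONE"), add("CALL")]
    for v in verbs:
        arcs.append((start, v))
    for first, last in NAMES:
        f = add(first)
        s = add(last)
        arcs.append((f, s))
        arcs.append((s, end))
        for v in verbs:
            arcs.append((v, f))
            arcs.append((v, s))

    return nodes, arcs


if __name__ == "__main__":
    nodes, arcs = build_network()
    print(len(nodes), "nodes,", len(arcs), "arcs")
    for src, dst in arcs[:5]:
        print(nodes[src], "->", nodes[dst])
```

Even for this small task the explicit form is far longer than the grammar, which is why the high-level notation is preferred for writing grammars and the conversion is left to a tool.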


ECRL HTK_V2.1