
13.11.3 Compatibility Mode

In HPARSE compatibility mode, the EBNF network is interpreted as it was by the HTK V1.5 HVITE program, in which HPARSE EBNF notation was used to define both the word-level syntax and the dictionary. Compatibility mode is intended for converting files written for HTK V1.5 into their equivalent HTK V2 representation. HPARSE therefore outputs the word-level portion of such an EBNF syntax as an HTK V2 lattice file, and optionally stores the pronunciation information in an HTK V2 dictionary file. When operating in compatibility mode without generating dictionary output, the pronunciation information is discarded.

In compatibility mode, the reserved node names WD_BEGIN and WD_END are used to delimit word boundaries: nodes between a WD_BEGIN/WD_END pair are called "word-internal", while all other nodes are "word-external". Every WD_BEGIN and WD_END node must have an "external name" attached that denotes the word. The number of WD_BEGIN nodes must equal the number of WD_END nodes, and there must be no direct connection from a WD_BEGIN node to a WD_END node. For example, a portion of such an HTK V1.5 network could be

     $A        =  WD_BEGIN%A ax WD_END%A;
     $ABDOMEN  =  WD_BEGIN%ABDOMEN ae b d ax m ax n WD_END%ABDOMEN;
     $ABIDES   =  WD_BEGIN%ABIDES ax b ay d z WD_END%ABIDES;
     $ABOLISH  =  WD_BEGIN%ABOLISH ax b aa l ih sh WD_END%ABOLISH;
      ... etc


     ( < 
        $A | $ABDOMEN | $ABIDES | $ABOLISH | ... etc
     > )

HPARSE will output the connectivity of the words in an HTK V2 word lattice format file and the pronunciation information in an HTK V2 dictionary. Word-external nodes are treated as words and stored in the lattice with corresponding entries in the dictionary.
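
To make the split between word-level and pronunciation information concrete, the following Python sketch pulls a (word, phone list) pair out of a single flat definition such as those in the example above. It is an illustration only and not part of HTK: the function name parse_word_definition and the regular expression are hypothetical, and the sketch assumes one WD_BEGIN/WD_END pair per definition with a plain phone sequence between them, whereas HPARSE itself accepts arbitrary networks there.

```python
import re

# Hypothetical helper, not part of HTK. Handles only flat definitions of the
# form:  $NAME = WD_BEGIN%WORD p1 p2 ... WD_END%WORD;
DEF_RE = re.compile(r"^\s*\$(\w+)\s*=\s*(.*?);\s*$")


def parse_word_definition(line):
    """Return (word, phones) for one flat V1.5-style word definition."""
    match = DEF_RE.match(line)
    if match is None:
        raise ValueError(f"not a variable definition: {line!r}")
    tokens = match.group(2).split()

    begins = [t for t in tokens if t.startswith("WD_BEGIN")]
    ends = [t for t in tokens if t.startswith("WD_END")]
    # Requirement from the text: equal numbers of WD_BEGIN and WD_END nodes.
    if len(begins) != len(ends):
        raise ValueError("unequal numbers of WD_BEGIN and WD_END nodes")
    # Assumption of this sketch: exactly one pair, at the two ends.
    if len(begins) != 1 or tokens[0] != begins[0] or tokens[-1] != ends[0]:
        raise ValueError("sketch supports a single enclosing WD_BEGIN/WD_END pair")

    # Requirement from the text: both nodes carry an external name after '%'.
    for node in (begins[0], ends[0]):
        if "%" not in node or not node.split("%", 1)[1]:
            raise ValueError(f"missing external name on {node!r}")

    phones = tokens[1:-1]
    # Requirement from the text: no direct WD_BEGIN -> WD_END connection.
    if not phones:
        raise ValueError("WD_BEGIN connects directly to WD_END")

    word = begins[0].split("%", 1)[1]
    return word, phones


if __name__ == "__main__":
    print(parse_word_definition("$ABIDES   =  WD_BEGIN%ABIDES ax b ay d z WD_END%ABIDES;"))
```

In this sketch the word name would go to the lattice side and the phone list to the dictionary side; the actual file formats written by HPARSE are not reproduced here.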

It should be noted that in HTK V1.5 any EBNF network, including one containing loops, could appear between a WD_BEGIN/WD_END pair. Care should therefore be taken with syntaxes that define very complex sets of alternative pronunciations. It should also be noted that each dictionary entry is limited in length to 100 phones. If multiple instances of the same word are found in the expanded HPARSE network, a dictionary entry is created for the first instance only; subsequent instances are ignored and a warning is printed. If words with a NULL external name are present, the dictionary will contain a NULL output symbol.
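
The first-instance rule and the length limit described above can be sketched in Python as follows. This is an illustration under stated assumptions, not HPARSE's implementation: build_dictionary and MAX_PHONES are hypothetical names, warnings are collected in a list rather than printed, and raising an error for an over-long entry is an assumption, since the text states only that the limit exists.

```python
MAX_PHONES = 100  # per-entry limit stated in the text


def build_dictionary(entries):
    """Build a word -> phones mapping from (word, phones) pairs.

    Only the first instance of each word is kept; later instances are
    skipped and a warning message is recorded for each of them.
    """
    dictionary = {}
    warnings = []
    for word, phones in entries:
        if len(phones) > MAX_PHONES:
            # Assumption: the behaviour on overflow is not given in the text.
            raise ValueError(f"entry for {word!r} exceeds {MAX_PHONES} phones")
        if word in dictionary:
            warnings.append(f"duplicate instance of {word!r} ignored")
            continue
        dictionary[word] = list(phones)
    return dictionary, warnings


if __name__ == "__main__":
    entries = [
        ("A", ["ax"]),
        ("ABIDES", ["ax", "b", "ay", "d", "z"]),
        ("A", ["ey"]),  # second instance of A: ignored with a warning
    ]
    dictionary, warnings = build_dictionary(entries)
    print(dictionary)
    print(warnings)
```

One consequence worth noting is that an alternative pronunciation supplied as a second instance of the same word is lost under this rule, which is why the order of instances in the expanded network matters.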

Finally, since the implementation of HPARSE network generation has been revised, the semantics of variable definition and use have changed slightly. Previously, variables could be redefined during network definition and each use would follow the most recent definition. In HTK V2, only the final definition of any variable is used in network expansion.
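
The difference between the two variable semantics can be shown with a small Python model. It is a sketch only: the event-list representation and the names expand_v15 and expand_v2 are hypothetical and do not correspond to anything in HTK. Each event is either a definition of a variable or a use of it, in the order they appear in the syntax file.

```python
def expand_v15(events):
    """V1.5-style semantics: each use follows the most recent definition."""
    current = {}
    resolved = []
    for kind, name, value in events:
        if kind == "define":
            current[name] = value
        else:  # "use"
            resolved.append(current[name])
    return resolved


def expand_v2(events):
    """V2-style semantics: every use follows the final definition."""
    # Later definitions overwrite earlier ones, leaving the final one.
    final = {name: value for kind, name, value in events if kind == "define"}
    return [final[name] for kind, name, _ in events if kind == "use"]


if __name__ == "__main__":
    events = [
        ("define", "X", "first"),
        ("use", "X", None),
        ("define", "X", "second"),
        ("use", "X", None),
    ]
    print(expand_v15(events))  # each use sees the definition preceding it
    print(expand_v2(events))   # both uses see the final definition
```

A V1.5 syntax file that relies on redefinition would therefore expand differently under HTK V2; giving each redefined variable a distinct name before conversion avoids the ambiguity.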



ECRL HTK_V2.1: email [email protected]