3.4.1 Step 11 - Recognising the Test Data

Assuming that test.scp holds a list of the coded test files, each test file can be recognised and its transcription written to an MLF called recout.mlf by executing the following command:

    HVite -H hmm15/macros -H hmm15/hmmdefs -S test.scp \
          -l '*' -i recout.mlf -w wdnet \
          -p 0.0 -s 5.0 dict tiedlist

The options -p and -s set the word insertion penalty and the grammar scale factor, respectively. The word insertion penalty is a fixed value added to each token when it transits from the end of one word to the start of the next. The grammar scale factor is the amount by which the language model probability is scaled before being added to each token at the same word boundary transition. These parameters can have a significant effect on recognition performance, so some tuning on development test data is well worthwhile.
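The effect of the two options on a token can be sketched in a few lines of Python. This is an illustration only, not HVITE's actual code: the function name and the example scores are hypothetical, and it assumes that token scores and language model probabilities are held as log values, so that "scaling" is a multiplication and "adding" is a sum.

```python
def word_boundary_update(token_logprob, lm_logprob, penalty=0.0, scale=5.0):
    """Return a token's score after it crosses one word boundary.

    Hypothetical helper for illustration; all values are assumed to be
    log probabilities.
    penalty -- word insertion penalty (the value given to -p)
    scale   -- grammar scale factor (the value given to -s)
    """
    return token_logprob + scale * lm_logprob + penalty


# Settings from the command above: -p 0.0 -s 5.0
default_score = word_boundary_update(-100.0, -2.0)

# A negative penalty lowers the score at every word boundary, so paths
# containing many short words become less attractive.
penalised_score = word_boundary_update(-100.0, -2.0, penalty=-10.0)

print(default_score, penalised_score)
```

Under these assumptions, a more negative -p value discourages insertions, while a larger -s value gives the language model more weight relative to the acoustic scores.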

The dictionary contains monophone transcriptions whereas the supplied HMM list contains word-internal triphones. HVITE will make the necessary conversions when loading the word network wdnet. However, if the HMM list contained both monophones and context-dependent phones, HVITE's automatic choice of network expansion would no longer be reliable. In that case, the required form of word-internal network expansion can be forced by setting the configuration variable FORCECXTEXP to true and ALLOWXWRDEXP to false (see Chapter 11 for details).

Assuming that the MLF testref.mlf contains word level transcriptions for each test file, the actual performance can be determined by running HRESULTS as follows:

    HResults -I testref.mlf tiedlist recout.mlf

The result would be a print-out of the form:

    ====================== HTK Results Analysis ==============
      Date: Sun Oct 22 16:14:45 1995
      Ref : testref.mlf
      Rec : recout.mlf
    ------------------------ Overall Results -----------------
    SENT: %Correct=98.50 [H=197, S=3, N=200]
    WORD: %Corr=99.77, Acc=99.65 [H=853, D=1, S=1, I=1, N=855]
    ==========================================================

The line starting with SENT: indicates that of the 200 test utterances, 197 (98.50%) were correctly recognised. The following line, starting with WORD:, gives the word level statistics and indicates that of the 855 words in total, 853 (99.77%) were recognised correctly. There was 1 deletion error (D), 1 substitution error (S) and 1 insertion error (I). The accuracy figure (Acc) of 99.65% is lower than the percentage correct (Corr) because it takes account of the insertion errors, which the latter ignores.
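The figures in the print-out can be checked by hand. The following Python sketch recomputes them from the bracketed counts, taking percentage correct as H/N and accuracy as (H - I)/N, which matches the description above. The function name is hypothetical and is not part of HTK.

```python
def word_scores(hits, insertions, total):
    """Return (%Corr, Acc) from the counts on the WORD: line.

    Hypothetical helper for illustration.
    %Corr ignores insertions; Acc subtracts them from the hits.
    """
    percent_correct = 100.0 * hits / total
    accuracy = 100.0 * (hits - insertions) / total
    return percent_correct, accuracy


# Counts from the WORD: line: H=853, D=1, S=1, I=1, N=855
corr, acc = word_scores(hits=853, insertions=1, total=855)

# Counts from the SENT: line: H=197, N=200
sent_correct = 100.0 * 197 / 200

print(f"SENT: %Correct={sent_correct:.2f}")
print(f"WORD: %Corr={corr:.2f}, Acc={acc:.2f}")
```

Note that the word total is consistent with the other counts, since N = H + D + S = 853 + 1 + 1 = 855. Insertions do not contribute to N.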

ECRL HTK_V2.1: email [email protected]