
Clean Digits

Each example is a single word from the 11-word vocabulary "zero" through "nine" plus "oh". Each word is spoken twice by each of 5 female and 5 male speakers, for 220 examples in all. I have marked a 'training' set and a 'test' set, so you can use the data to build a very simple speech recognizer: use the first 3 speakers of each gender to set the parameters of your classifier, then test its performance on the remaining two male and/or two female speakers.
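
As one illustration of "setting the parameters" and then testing, here is a minimal sketch of a nearest-centroid classifier. It is only one possible choice of classifier, not a method prescribed by this page. It assumes each recording has already been reduced to a fixed-length feature vector by some front end of your choosing; the function names and the toy numbers in the usage lines are hypothetical.

```python
from collections import defaultdict
import math


def train_centroids(examples):
    """Average the feature vectors of each word to get one centroid per word.

    `examples` is a list of (word, feature_vector) pairs, e.g. from the
    training speakers. Feature extraction is assumed to happen elsewhere.
    """
    sums = {}
    counts = defaultdict(int)
    for word, vec in examples:
        if word not in sums:
            sums[word] = [0.0] * len(vec)
        sums[word] = [s + v for s, v in zip(sums[word], vec)]
        counts[word] += 1
    return {w: [s / counts[w] for s in total] for w, total in sums.items()}


def classify(centroids, vec):
    """Return the word whose centroid is closest (Euclidean distance) to vec."""
    return min(centroids, key=lambda w: math.dist(centroids[w], vec))


def accuracy(centroids, examples):
    """Fraction of (word, feature_vector) pairs that are classified correctly."""
    correct = sum(1 for word, vec in examples if classify(centroids, vec) == word)
    return correct / len(examples)


# Toy 2-D features, invented purely to exercise the code (not real measurements).
toy_train = [("oh", [0.0, 0.0]), ("oh", [0.2, 0.0]),
             ("six", [1.0, 1.0]), ("six", [1.0, 0.8])]
toy_test = [("oh", [0.1, 0.1]), ("six", [0.9, 1.0])]

model = train_centroids(toy_train)
print(accuracy(model, toy_test))
```

With real data, `toy_train` would hold vectors from the training speakers and `toy_test` vectors from the held-out speakers, so the reported accuracy reflects speakers the classifier has never heard.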

Each row in the following table corresponds to a particular speaker, with the words arranged in columns. For each word, there is an "A" repetition and a "B" repetition.

Training examples

Speaker       oh     zero   one    two    three  four   five   six    seven  eight  nine
MAE (male)    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B
MBD (male)    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B
MCB (male)    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B
FAC (female)  A B    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B
FBH (female)  A B    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B
FCA (female)  A B    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B

Test examples

Speaker       oh     zero   one    two    three  four   five   six    seven  eight  nine
MDL (male)    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B
MEH (male)    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B
FDC (female)  A B    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B
FEA (female)  A B    A B    A B    A B    A B    A B    A B    A B    A B    A B    A B
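
The two tables above can be written down as a speaker-level split. The sketch below enumerates every (speaker, word, repetition) combination in each set; the speaker codes, words, and repetition labels are taken from the tables, while the variable and function names are hypothetical. The tuples are labels only: how they map to audio file names is not stated here and is left to the reader.

```python
from itertools import product

WORDS = ["oh", "zero", "one", "two", "three", "four",
         "five", "six", "seven", "eight", "nine"]
REPETITIONS = ["A", "B"]

# Speaker codes as listed in the training and test tables.
TRAIN_SPEAKERS = ["MAE", "MBD", "MCB", "FAC", "FBH", "FCA"]
TEST_SPEAKERS = ["MDL", "MEH", "FDC", "FEA"]


def clips(speakers):
    """List every (speaker, word, repetition) combination for the given speakers."""
    return list(product(speakers, WORDS, REPETITIONS))


train_clips = clips(TRAIN_SPEAKERS)
test_clips = clips(TEST_SPEAKERS)

# 6 speakers x 11 words x 2 repetitions, and 4 x 11 x 2.
print(len(train_clips), len(test_clips))  # prints: 132 88
```

Splitting by speaker rather than by repetition means that no test speaker contributes to training, so the test score measures how well the classifier generalizes to new voices.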

Notes on data sources

The spoken digits come from the TIDIGITS corpus, which consists of several thousand continuous-digit utterances and also includes isolated digits for each of its 55 male and 55 female training speakers.


Last updated: 2003/02/17 23:27:22

Dan Ellis <[email protected]>