Each example is a single word from the 11-word vocabulary "zero" through "nine" plus "oh". Each word is spoken twice by each of 5 female and 5 male speakers. I have indicated a 'training' and a 'test' set, so you could use these examples to build a very simple speech recognizer: use the first 3 speakers of each gender to set the parameters of your classifier, then test its performance on the remaining 2 male and/or 2 female speakers.

Each row in the tables below corresponds to one speaker, with the words arranged in columns. For each word, there is an "A" repetition and a "B" repetition.

Training set:

Speaker | oh | zero | one | two | three | four | five | six | seven | eight | nine |
---|---|---|---|---|---|---|---|---|---|---|---|
MAE (male) | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B |
MBD (male) | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B |
MCB (male) | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B |
FAC (female) | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B |
FBH (female) | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B |
FCA (female) | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B |

Test set:

Speaker | oh | zero | one | two | three | four | five | six | seven | eight | nine |
---|---|---|---|---|---|---|---|---|---|---|---|
MDL (male) | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B |
MEH (male) | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B |
FDC (female) | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B |
FEA (female) | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B | A, B |
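
The training/test split described at the top of this page can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the speaker codes and words are taken from the tables, but the `(speaker, word, repetition)` triples, the helper names, and the `featurize` callback (which you would have to write yourself, for example to load a sound file and compute some feature vector) are hypothetical and not part of the data set. The nearest-centroid rule is just one very simple classifier chosen for the sketch, not a recommended recognizer.

```python
from itertools import product

# Vocabulary and speaker codes as listed in the tables on this page.
WORDS = ["oh", "zero", "one", "two", "three", "four",
         "five", "six", "seven", "eight", "nine"]
REPETITIONS = ["A", "B"]
TRAIN_SPEAKERS = ["MAE", "MBD", "MCB", "FAC", "FBH", "FCA"]  # first 3 of each gender
TEST_SPEAKERS = ["MDL", "MEH", "FDC", "FEA"]                 # remaining 2 of each gender


def utterances(speakers):
    """Return every (speaker, word, repetition) triple for the given speakers."""
    return list(product(speakers, WORDS, REPETITIONS))


def fit_centroids(examples, featurize):
    """'Train' by averaging the feature vectors of each word.

    featurize is a user-supplied (hypothetical) function that maps a
    (speaker, word, repetition) triple to a list of floats.
    """
    sums, counts = {}, {}
    for utt in examples:
        word = utt[1]
        vec = featurize(utt)
        if word not in sums:
            sums[word] = [0.0] * len(vec)
            counts[word] = 0
        sums[word] = [a + b for a, b in zip(sums[word], vec)]
        counts[word] += 1
    return {w: [v / counts[w] for v in sums[w]] for w in sums}


def classify(centroids, vec):
    """Return the word whose centroid is closest (squared Euclidean distance)."""
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(center, vec))
    return min(centroids, key=lambda w: sqdist(centroids[w]))


def accuracy(centroids, examples, featurize):
    """Fraction of examples whose predicted word matches the spoken word."""
    correct = sum(classify(centroids, featurize(u)) == u[1] for u in examples)
    return correct / len(examples)


train = utterances(TRAIN_SPEAKERS)
test = utterances(TEST_SPEAKERS)
print(len(train), len(test))  # 132 88
```

With a real `featurize` function in hand, `fit_centroids(train, featurize)` would set the classifier parameters from the six training speakers, and `accuracy(centroids, test, featurize)` would score it on the four held-out speakers. Restricting `TEST_SPEAKERS` to the male or female codes gives the "and/or" variants mentioned above.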

The spoken digits are from the TIDIGITS corpus, which consists of several thousand continuous-digit utterances and also includes isolated digits for each of its 55 male and 55 female training speakers.

Last updated: 2003/02/17 23:27:22

Dan Ellis <dpwe@ee.columbia.edu>