Each of these examples is a single word from the 11-word vocabulary of "zero" through "nine" plus "oh". Every word is spoken twice by each of 5 female and 5 male speakers, giving 220 recordings in all. I have indicated a 'training' and a 'test' set, so you could use the data to build a very simple speech recognizer: use the first 3 speakers of each gender to set the parameters of your classifier, then test its performance on the remaining two male and/or two female speakers.
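As a rough illustration of that split, the sketch below enumerates every recording by speaker, word, and repetition. The speaker IDs and word order are the ones used in the tables on this page; the variable and function names are only illustrative, and no file naming scheme is assumed.

```python
from itertools import product

# Vocabulary in the column order used by the tables on this page.
WORDS = ["oh", "zero", "one", "two", "three", "four",
         "five", "six", "seven", "eight", "nine"]
REPETITIONS = ["A", "B"]

# First three speakers of each gender are for training, the rest for testing.
TRAIN_SPEAKERS = ["MAE", "MBD", "MCB", "FAC", "FBH", "FCA"]
TEST_SPEAKERS = ["MDL", "MEH", "FDC", "FEA"]


def utterances(speakers):
    """Return every (speaker, word, repetition) triple for the given speakers."""
    return list(product(speakers, WORDS, REPETITIONS))


train = utterances(TRAIN_SPEAKERS)
test = utterances(TEST_SPEAKERS)
print(len(train), len(test))  # 132 88
```

Splitting by speaker rather than by repetition means the test score reflects how well the classifier handles voices it has never heard.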

Each row in the following table corresponds to a particular speaker, with the words arranged in columns. For each word, there is an "A" repetition and a "B" repetition.

Training set (first three speakers of each gender):

Speaker | oh | zero | one | two | three | four | five | six | seven | eight | nine |
---|---|---|---|---|---|---|---|---|---|---|---|
MAE (male) | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B |
MBD (male) | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B |
MCB (male) | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B |
FAC (female) | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B |
FBH (female) | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B |
FCA (female) | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B |

Test set (remaining two speakers of each gender):

Speaker | oh | zero | one | two | three | four | five | six | seven | eight | nine |
---|---|---|---|---|---|---|---|---|---|---|---|
MDL (male) | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B |
MEH (male) | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B |
FDC (female) | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B |
FEA (female) | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B | A B |
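One very simple way to "set the parameters" on the training speakers and score the test speakers is a nearest-centroid classifier, sketched below. This is only one possible approach, not a prescribed method. It assumes each recording has already been reduced to a fixed-length feature vector (real recordings differ in length, so that step needs its own design), and the function names and toy numbers are made up for illustration.

```python
import math
from collections import defaultdict


def train_centroids(examples):
    """Average the feature vectors of each word.

    examples: iterable of (word, feature_vector) pairs from the training speakers.
    Returns a dict mapping each word to its mean feature vector.
    """
    sums = {}
    counts = defaultdict(int)
    for word, vec in examples:
        if word not in sums:
            sums[word] = [0.0] * len(vec)
        for i, x in enumerate(vec):
            sums[word][i] += x
        counts[word] += 1
    return {w: [s / counts[w] for s in sums[w]] for w in sums}


def classify(centroids, vec):
    """Return the word whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda w: math.dist(centroids[w], vec))


def accuracy(centroids, examples):
    """Fraction of (word, feature_vector) pairs that are classified correctly."""
    examples = list(examples)
    correct = sum(classify(centroids, vec) == word for word, vec in examples)
    return correct / len(examples)


# Toy two-dimensional vectors standing in for real features (not measured data).
toy_train = [("oh", [0.0, 0.1]), ("oh", [0.2, -0.1]),
             ("two", [1.0, 1.1]), ("two", [0.8, 0.9])]
toy_test = [("oh", [0.1, 0.2]), ("two", [0.9, 1.2])]

model = train_centroids(toy_train)
print(accuracy(model, toy_test))  # 1.0
```

With the real data, the training pairs would come from the six training speakers and the accuracy would be measured on the four test speakers.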

The spoken digits are taken from the TIDIGITS corpus, which consists of several thousand continuous-digit utterances and also includes isolated digits for each of its 55 male and 55 female training speakers.

