This block of examples consists of complete spoken sentences. As with the digits examples, each one has been subjected to the same range of modifications: it has been filtered by four different channel characteristics (f1..f4); it has had three different kinds of noise (n0, n1, n2) added at two levels (nXL, nXH); and it has had two reverberation characteristics (r1, r2) applied, also at two direct-to-reverberant levels (rXL, rXH).
Each row in the following table corresponds to a single base sample, with the differently corrupted versions arrayed across the columns.
Base utterance | Filtered | | | | Noisy | | | | | | Reverb | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
sentence-female-1 | f1 | f2 | f3 | f4 | n0L | n0H | n1L | n1H | n2L | n2H | r1L | r1H | r2L | r2H |
sentence-female-2 | f1 | f2 | f3 | f4 | n0L | n0H | n1L | n1H | n2L | n2H | r1L | r1H | r2L | r2H |
sentence-female-3 | f1 | f2 | f3 | f4 | n0L | n0H | n1L | n1H | n2L | n2H | r1L | r1H | r2L | r2H |
sentence-male-1 | f1 | f2 | f3 | f4 | n0L | n0H | n1L | n1H | n2L | n2H | r1L | r1H | r2L | r2H |
sentence-male-2 | f1 | f2 | f3 | f4 | n0L | n0H | n1L | n1H | n2L | n2H | r1L | r1H | r2L | r2H |
sentence-male-3 | f1 | f2 | f3 | f4 | n0L | n0H | n1L | n1H | n2L | n2H | r1L | r1H | r2L | r2H |
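
The column labels follow a regular pattern, so they can be enumerated rather than typed out. The following is a minimal Python sketch under that assumption; the function name `condition_labels` is hypothetical and only reproduces the labels shown in the table, not any file-naming scheme used for the audio itself.

```python
from itertools import product


def condition_labels():
    """Return the 14 condition labels in the same order as the table columns.

    Hypothetical helper: it mirrors the labels in the table above and makes
    no assumption about how the underlying audio files are named.
    """
    # Four channel filters: f1..f4
    filtered = [f"f{i}" for i in range(1, 5)]
    # Three noise types (n0, n1, n2), each at two levels (L, H)
    noisy = [f"n{kind}{level}" for kind, level in product(range(3), "LH")]
    # Two reverberation characteristics (r1, r2), each at two levels (L, H)
    reverb = [f"r{kind}{level}" for kind, level in product((1, 2), "LH")]
    return filtered + noisy + reverb


labels = condition_labels()
print(len(labels))
print(" ".join(labels))
```

With six base utterances and 14 conditions each, the table covers 84 modified samples in total.
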
The sentences are drawn from the TIMIT corpus, one of the earliest standard speech recognition corpora, which collected a wide range of material covering different phonetic contexts, accents, and so on. The background noises are excerpted from the Aurora noisy digits evaluation, and the reverberation characteristics come from a couple of miscellaneous sources.