Dan Ellis :

Sound Examples

This directory contains a collection of small sound example sets. They can be useful, for example, as the basis for student projects. Currently we have:

Distorted Digit Strings: A set of spoken digit strings, subject to a range of channel distortions and added noises (derived from TIDIGITS).
Clean Digits: Digits 0-9 and "oh" for 10 speakers, recorded in quiet (also derived from TIDIGITS).
Sentences: A few complete sentences by a few different speakers, subject to different channel and noise distortions.
Meeting Recorder: A small excerpt from a corpus of recordings of real meetings, multi-speaker and multi-microphone (from the ICSI Meeting Recorder project).
ShATR: Excerpt from ShATR, another multi-speaker, multi-microphone corpus collected at ATR (Japan) by researchers visiting from Sheffield (more at the ShATR homepage).
Music/Speech: Some short recordings from the radio composed of both music and speech and mixtures of the two (derived from the corpus collected by Scheirer and Slaney).
DTMF: Telephone "touch-tone" sequences recorded under various non-ideal conditions (self-recorded).
Alarms: A small collection of alarm sounds, such as phone rings and sirens (collected from the web and self-recorded).
Musical instruments: Isolated notes played by a few different orchestral instruments (from the McGill Master Samples).
Noises: A range of different examples of background noise (from the Aurora database distribution, and perhaps elsewhere).
Music - original and synthetic: A small collection of real music along with matched MIDI "replicas", and also manual markings of major segment boundaries.

Handling sounds in Matlab

After downloading them to your local machine, you can manipulate these sounds in Matlab, as shown in the following transcript:

>> % Read in the sound data
>> [d,r] = wavread('msmn1.wav');
>> % r is the sampling rate
>> r
r =
       22050
>> % d is the data
>> size(d)
ans =
      110250           1
>> % i.e. 110250 samples = 5 seconds * 22050 samples/sec
>> % Listen to it
>> soundsc(d,r);
>> % Look at the spectrogram (spectrum as a function of time)
>> specgram(d,1024,r);
>> % Design a quick high-pass filter at 1000 Hz (relative to nyquist rate r/2)
>> [b,a] = ellip(8,1,50,1000/(r/2),'high');
>> % Pass it through the filter
>> df = filter(b,a,d);
>> % See how the spectrogram is changed
>> specgram(df,1024,r);
>> % Most of the energy below 1000 Hz has been removed
>> % Take a listen
>> soundsc(df,r);
>> % .. all the 'bass' is gone
>> % Write it out to a new soundfile
>> wavwrite(df,r,'tmp.wav');
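
The transcript above uses wavread and wavwrite, which recent Matlab releases no longer provide, and specgram, which is no longer the recommended spectrogram function. The following is a minimal sketch of the same steps using audioread, audiowrite and spectrogram, assuming the Signal Processing Toolbox is installed. The synthetic fallback signal and the 0.99 rescaling factor are illustrative assumptions that are not part of the original transcript.

```matlab
% Sketch of the transcript above for newer Matlab releases
% (assumes the Signal Processing Toolbox for ellip, hann and spectrogram).
fname = 'msmn1.wav';                 % file name taken from the transcript
if exist(fname, 'file')
    [d, r] = audioread(fname);       % d: samples scaled to [-1, 1], r: sampling rate in Hz
else
    % Assumption: the file has not been downloaded, so build a 5 s stand-in
    r = 22050;
    t = (0:5*r-1)' / r;
    d = 0.4*sin(2*pi*200*t) + 0.4*sin(2*pi*3000*t);   % 200 Hz + 3000 Hz tones
end

% Same 8th-order elliptic high-pass as in the transcript:
% 1 dB passband ripple, 50 dB stopband attenuation, 1000 Hz edge
[b, a] = ellip(8, 1, 50, 1000/(r/2), 'high');
df = filter(b, a, d);

% Roughly matches specgram(df,1024,r): 1024-point Hann window, 50% overlap
spectrogram(df(:,1), hann(1024), 512, 1024, r, 'yaxis');

% Rescale before writing, since audiowrite clips values outside [-1, 1]
peak = max(abs(df(:)));
if peak > 0
    df = 0.99 * df / peak;
end
audiowrite('tmp.wav', df, r);
```

Listening with soundsc(df,r) works as in the transcript. The rescaling step is a design choice: filtering can push peaks slightly beyond the original range, and normalizing first avoids clipping in the written file.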

Other resources

Here are some links to other interesting sound collections available on the web:

Phoneme examples in Klatt analysis format, from the MIT OpenCourseWare site for 6.452 Speech Lab.

Acknowledgment

This material is based in part upon work supported by the National Science Foundation under Grant No. IIS-0238301. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).

Last updated: $Date: 2003/02/17 23:28:08 $

Dan Ellis <[email protected]>