One "holy grail" of automatic sound analysis is music transcription, which can be defined as converting a real recording of music into an equivalent MIDI representation. As it happens, a large amount of pop music has already been converted into MIDI, not automatically but by fans and hobbyists. These MIDI replicas are an intriguing information source for music research, as discussed in our paper on MIDI alignment.
This page brings together a few examples of actual musical recordings along with their MIDI equivalents, both as MIDI files and synthesized into audio. We also provide hand-marked labels for the major segment breaks in this music, covering some of the original recordings and some of the MIDI-synthesized audio.
|Track||Artist||Original audio||Transcript of original||MIDI||MIDI audio||Transcript of MIDI|
|Around the World||ATC||around_the_world-atc.wav (9M)||around_the_world-atc.phn||around_the_world-atc.mid||around_the_world-atc-midi.wav||around_the_world-atc-midi.phn|
|I ran so far away||Flock of Seagulls||i_ran_so_far_away-flock_of_seagulls.wav (13M)||i_ran_so_far_away-flock_of_seagulls.phn||i_ran_so_far_away-flock_of_seagulls.mid||i_ran_so_far_away-flock_of_seagulls-midi.wav||i_ran_so_far_away-flock_of_seagulls-midi.phn|
|Temple of Love||Sisters of Mercy||temple_of_love-sisters_of_mercy.wav (21M)||temple_of_love-sisters_of_mercy.phn||temple_of_love-sisters_of_mercy.mid||temple_of_love-sisters_of_mercy-midi.wav||(not available)|
|Beautiful Life||Ace of Base||beautiful_life-ace_of_base.wav (9.5M)||(not available)||beautiful_life-ace_of_base.mid||beautiful_life-ace_of_base-midi.wav||beautiful_life-ace_of_base-midi.phn|
|Don't Speak||No Doubt||dont_speak-no_doubt.wav (11M)||(not available)||dont_speak-no_doubt.mid||dont_speak-no_doubt-midi.wav||dont_speak-no_doubt-midi.phn|
|Mambo No. 5||Lou Bega||mambo_no_5-lou_bega.wav (9.6M)||(not available)||mambo_no_5-lou_bega.mid||mambo_no_5-lou_bega-midi.wav||mambo_no_5-lou_bega-midi.phn|
|Evangeline||Matthew Sweet||evangeline-matthew_sweet.wav (13M)||evangeline-matthew_sweet.phn||(not available)||(not available)||(not available)|
The audio files (*.wav) are in MS-WAVE format, downsampled to 22.05 kHz mono from the original CD tracks.
The transcript files (*.phn) are text files where each line has the format
<start-sample> <end-sample> <description>
where the sample indices refer to the 22.05 kHz audio, so dividing a value by 22050 converts it to a time in seconds.
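The conversion described above can be sketched in Python. This is only a minimal sketch, assuming the fields are separated by whitespace and that the description may itself contain spaces. The names `parse_phn` and `Segment` are hypothetical, and the example lines are made up for illustration rather than taken from any of the files listed above.

```python
from typing import List, NamedTuple

SAMPLE_RATE = 22050  # samples per second, per the description above


class Segment(NamedTuple):
    start_sec: float
    end_sec: float
    label: str


def parse_phn(text: str, sample_rate: int = SAMPLE_RATE) -> List[Segment]:
    """Parse transcript text into segments with times in seconds."""
    segments = []
    for line in text.splitlines():
        # Split into at most three fields so the description keeps its spaces.
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        start, end = int(parts[0]), int(parts[1])
        label = parts[2].strip() if len(parts) > 2 else ""
        segments.append(Segment(start / sample_rate, end / sample_rate, label))
    return segments


# Made-up example lines, not taken from any of the files listed above.
example = "0 22050 intro\n22050 66150 verse 1\n"
for seg in parse_phn(example):
    print(f"{seg.start_sec:.2f} {seg.end_sec:.2f} {seg.label}")
```

To use it on a real transcript, read the file's contents as text and pass them to `parse_phn`. The sample rate is a parameter in case the labels are ever applied to audio at a different rate.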
The track selection and MIDI file processing were done by Rob Turetsky. The manual transcriptions were created by Angel Umpierre during an internship in Summer 2003. Thanks to them both.
This material is based in part upon work supported by the National Science Foundation under Grant No. IIS-0238301. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).