MP3 reading and writing

These functions, mp3read and mp3write, aim to exactly duplicate the operation of wavread and wavwrite for accessing soundfiles, except that the soundfiles are in MPEG Audio Layer 3 (MP3) compressed format. All the hard work is done by external binaries written by others: mp3info to query the format of existing mp3 files, mpg123 to decode mp3 files, and lame to encode audio files. Binaries for these programs are widely available (and may be included in this distribution).

These functions were originally developed for access to very large mp3 files (i.e. many hours long), and so avoid creating the entire uncompressed audio stream if possible. mp3read allows you to specify the range of frames you want to read (as a second argument), and will then construct an mpg123 command that skips blocks so that only the required part of the file is decoded. This can be much quicker (and require less memory and temporary disk space) than decoding the whole file.
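
As a rough sketch of range reading, the snippet below converts a start time and a duration in seconds into a range of sample frames and passes it as the second argument. The file name lecture.mp3 and the sampling rate are hypothetical, and the [first last] form of the range is assumed to mirror wavread; check "help mp3read" for the exact form your version accepts.

```matlab
% Hypothetical values: a long recording sampled at 44.1 kHz
sr = 44100;      % assumed sampling rate (Hz)
tstart = 3600;   % start one hour into the file (seconds)
tdur = 10;       % read ten seconds
% Convert times to 1-based sample frame indices
n1 = round(tstart*sr) + 1;
n2 = n1 + round(tdur*sr) - 1;
% Decode just this range (assumes a [first last] range, as in wavread).
% The call is skipped when mp3read or the hypothetical file is missing.
if exist('mp3read','file') && exist('lecture.mp3','file')
  [d,sr] = mp3read('lecture.mp3',[n1 n2]);
end
```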

mpg123 also provides for "on the fly" downsampling and conversion to mono, both of which are supported as extra options in mp3read.
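
To give a feel for why these options matter for very long files, the arithmetic below estimates the memory needed to hold a decoded recording as Matlab doubles, with and without mono conversion and downsampling. The recording length, rate, and downsampling factor are hypothetical, and the call shown in the final comment is an assumed argument order rather than a confirmed signature; check "help mp3read".

```matlab
% Hypothetical recording: 3 hours of 44.1 kHz stereo
sr = 44100; nchan = 2; nhours = 3;
nframes = nhours*3600*sr;
bytes_full = nframes*nchan*8;          % a double takes 8 bytes
% Mono at a quarter of the rate: one channel, sr/4 frames per second
downsamp = 4;
bytes_small = (nframes/downsamp)*1*8;
fprintf('full: %.2f GB, mono + downsample by %d: %.2f GB\n', ...
        bytes_full/2^30, downsamp, bytes_small/2^30);
% Assumed call form (an assumption, not confirmed by this text):
%   [d,sr] = mp3read('lecture.mp3',0,1,downsamp);
```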

For more information, including advice on handling MP4 files, see http://labrosa.ee.columbia.edu/matlab/mp3read.html

Example usage

Here, we read a wav file in, then write it out as an MP3, then read the resulting MP3 back in, and compare it to the original file.

% Read an audio waveform
[d,sr] = wavread('piano.wav');
% Save to mp3 (default settings)
mp3write(d,sr,'piano.mp3');
% Read it back again
[d2,sr] = mp3read('piano.mp3');
% mp3 encoding involves some extra padding at each end; we attempt
% to cut it off at the start, but can't do that at the end, because
% mp3read doesn't know how long the original was.  But we do, so..
% Chop it down to be the same length as the original
d2 = d2(1:length(d),:);
% What is the SNR (distortion)?
ddiff = d - d2;
disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']);
% Do they look similar?
subplot(211)
specgram(d(:,1),1024,sr);
subplot(212)
plot(1:5000,d(10000+(1:5000),1),1:5000,d2(10000+(1:5000),1));
% Yes, pretty close
%
% NB: lame followed by mpg123 causes a little attenuation; you
% can get a better match by scaling up the read-back waveform:
ddiff = d - 1.052*d2;
disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']);
SNR is 21.9693 dB
SNR is 24.0399 dB
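
Rather than picking a scale factor such as 1.052 by hand, the gain that maximizes the SNR can be estimated by least squares. The following is a minimal sketch using synthetic signals as stand-ins for d and d2 (a sine wave, attenuated and with a small added error); it is not part of mp3read or mp3write.

```matlab
% Synthetic stand-ins for the original (d) and read-back (d2) waveforms
t = (0:9999)'/44100;
d = sin(2*pi*440*t);
d2 = d/1.05 + 0.01*cos(2*pi*1234*t);   % attenuated copy plus a small error
% Least-squares gain g minimizing sum((d - g*d2).^2)
g = (d2(:)'*d(:))/(d2(:)'*d2(:));
% SNR in dB of signal x given error e
snrdb = @(x,e) 10*log10(sum(x(:).^2)/sum(e(:).^2));
snr_raw = snrdb(d, d - d2);
snr_fit = snrdb(d, d - g*d2);
disp(['Best gain is ',num2str(g)]);
disp(['SNR is ',num2str(snr_raw),' dB unscaled, ',num2str(snr_fit),' dB scaled']);
```

With real data, the same two lines that compute g and the scaled SNR can be applied directly to d and the trimmed d2 from the example above.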

External binaries

The m files rely on three external binaries, each of which is available for Linux, Mac OS X, and Windows:

mpg123 is a high-performance mp3 decoder. Its home page is http://www.mpg123.de/ .

mp3info is a utility to read technical information on an mp3 file. Its home page is http://www.ibiblio.org/mp3info/ .

lame is an open-source MP3 encoder. Its home page is http://lame.sourceforge.net/ .

The various authors of these packages are gratefully acknowledged for doing all the hard work to make these Matlab functions possible.

Installation

The two routines, mp3read.m and mp3write.m, will look for their binaries (mpg123 and mp3info for mp3read; lame for mp3write) in the same directory where they are installed. Binaries for different architectures are distinguished by their extension, which is the standard Matlab computer code for the platform, e.g. ".mac" for Mac PPC OS X or ".glnx86" for i386-linux. The exception is Windows, where the binaries have the extension ".exe".
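
The lookup described above can be sketched roughly as follows. This is an illustration rather than the code in mp3read.m: it assumes the extension is the lower-cased output of Matlab's computer function (consistent with ".mac" and ".glnx86"), and it falls back to the current directory when mp3read.m is not on the path.

```matlab
% Pick the platform-specific extension
if ispc
  ext = 'exe';
else
  ext = lower(computer);   % assumption: lower-cased platform code
end
% Look next to mp3read.m if it is on the path, else in the current directory
mfile = which('mp3read');
if isempty(mfile)
  bindir = pwd;
else
  bindir = fileparts(mfile);
end
mpg123 = fullfile(bindir,['mpg123.',ext]);
% Report a missing binary before trying to run it
if ~exist(mpg123,'file')
  warning('mpg123 binary not found at %s', mpg123);
end
```

Running this before the first call to mp3read is a quick way to confirm that the binary for your platform is named as expected.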

Temporary files will be written to (a) a directory taken from the environment variable TMPDIR, (b) /tmp if it exists, or (c) the current directory. This can easily be changed by editing the m files.
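
A minimal sketch of that fallback order is shown below, in case you want to check which directory will be used on your system. The check that the TMPDIR directory actually exists is an addition made here for safety and may not match what the m files do.

```matlab
% (a) Directory named by the TMPDIR environment variable, if set
tmpdir = getenv('TMPDIR');
if isempty(tmpdir) || ~exist(tmpdir,'dir')
  if exist('/tmp','dir')
    % (b) /tmp, if it exists
    tmpdir = '/tmp';
  else
    % (c) the current directory
    tmpdir = pwd;
  end
end
disp(['Temporary files would go to ',tmpdir]);
```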

% Last updated: $Date: 2007/02/04 04:12:56 $
% Dan Ellis <[email protected]>