E4896: Music Signal Processing

Syllabus

For an overall outline of the topics covered in the course, see below.

The following is the detailed syllabus, outlining the actual lectures as they were delivered in Spring 2004.

Links to additional material distributed in the lecture, as well as sound files, are included below as well. Note that the download area of this web site is password protected. Please contact the instructor if you need access.

Reference books are cited with abbreviations. For example, "TR.62-64" means "Total Recording" by D. Moulton, pages 62-64, and "MDFT.Preface" corresponds to an online citation of the Preface in the "Mathematics of the DFT" book by J. O. Smith III. Please see the Reading page for the definition of these abbreviations. You do not need to review the material in the reference books, but citations are provided for those who want to read further on the various subjects.

Week 1

January 20

Audio and acoustics in open air and enclosed rooms. Definition of SPL, dB SPL, dependence of sound pressure level on distance, atmospheric absorption, relative humidity, and temperature. Sensitivity in frequency and time - the "audio window". Perception of pitch and frequency. Fletcher-Munson equal loudness curves, SPL metering with A, B, and C weighting curves. Room acoustics: direct and reflected sound, 7-reflection model, definition of critical distance, RT60, absorption properties of different materials, modes and their calculation. Desirable characteristics of spaces used for critical listening.
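As a rough illustration of the quantities defined in this lecture, the sketch below computes dB SPL from an RMS pressure, the free-field level drop with distance, a Sabine estimate of RT60, and the frequency of an axial room mode. It is a minimal sketch rather than the lecture's own worked example: the function names are made up for illustration, atmospheric absorption is ignored, and the speed of sound is assumed to be 343 m/s.

```python
import math

P_REF_PA = 20e-6  # standard reference pressure for dB SPL: 20 micropascals

def db_spl(pressure_rms_pa):
    """Sound pressure level in dB SPL for an RMS pressure in pascals."""
    return 20.0 * math.log10(pressure_rms_pa / P_REF_PA)

def spl_at_distance(spl_ref_db, d_ref_m, d_m):
    """Free-field level at d_m given the level at d_ref_m (inverse-square law only)."""
    return spl_ref_db - 20.0 * math.log10(d_m / d_ref_m)

def rt60_sabine(volume_m3, absorption_m2):
    """Sabine estimate of RT60 in seconds; absorption is sum(area * coefficient)."""
    return 0.161 * volume_m3 / absorption_m2

def axial_mode_hz(order, dimension_m, speed_of_sound=343.0):
    """Frequency of an axial room mode along one room dimension (assumed 343 m/s)."""
    return order * speed_of_sound / (2.0 * dimension_m)

print(round(db_spl(1.0), 2))                      # 1 Pa RMS is about 94 dB SPL
print(round(spl_at_distance(94.0, 1.0, 2.0), 2))  # doubling distance loses about 6 dB
print(round(rt60_sabine(200.0, 40.0), 3))         # hypothetical room: 200 m^3, 40 m^2 sabins
print(round(axial_mode_hz(1, 5.0), 1))            # first axial mode of a 5 m dimension
```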

The lecture covered material from TR.25-108. The physical properties of sound as they relate to temperature, relative humidity, and distance are covered in detail in SSE.145-150. Students should also review Part I of the TR book, on the "Modern Music Recording Process", which covers several operational and artistic aspects of sound engineering.

Supporting material

Download slides.

Read the two-part article on the career and accomplishments of audio engineering and sound recording pioneer Bill Putnam on the Mix Magazine web site: part 1 (October 2003), part 2 (November 2003).

Week 2

January 27

Definition of additional, electronics-oriented dB metrics used in audio engineering: dBW, dBm, dBu, dBV, dBv, and dBFS. (Although most of this is covered in TR.51-52, I also used the material from SRH.19-28. A good summary is also available in MDFT.Decibels.) Input/output impedance considerations. Discussion of consumer- and professional-grade analog audio connections, including balanced and unbalanced (RCA/phono, 1/4" phone TS and TRS, 1/8" TS and TRS phone/miniplug, XLR). Brief summary of proper grounding considerations (see SRH.327-345 for more details).
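The dB metrics listed above differ only in their reference quantity. The following sketch, with illustrative function names that are not taken from the course material, converts between volts and dBu/dBV, watts and dBm, and sample values and dBFS, and then compares the common +4 dBu professional and -10 dBV consumer nominal levels.

```python
import math

# dBu reference: the voltage that dissipates 1 mW in 600 ohms, about 0.7746 V.
DBU_REF_V = math.sqrt(0.6)

def volts_to_dbu(v_rms):
    return 20.0 * math.log10(v_rms / DBU_REF_V)

def dbu_to_volts(dbu):
    return DBU_REF_V * 10.0 ** (dbu / 20.0)

def volts_to_dbv(v_rms):
    return 20.0 * math.log10(v_rms / 1.0)  # dBV reference is 1 V RMS

def dbv_to_volts(dbv):
    return 10.0 ** (dbv / 20.0)

def watts_to_dbm(p_watts):
    return 10.0 * math.log10(p_watts / 1e-3)  # dBm reference is 1 mW

def sample_to_dbfs(peak_value, full_scale):
    """Level of a digital peak value relative to full scale."""
    return 20.0 * math.log10(abs(peak_value) / full_scale)

pro_v = dbu_to_volts(4.0)         # nominal professional line level
consumer_v = dbv_to_volts(-10.0)  # nominal consumer line level
print(round(pro_v, 3), round(consumer_v, 3))
print(round(20.0 * math.log10(pro_v / consumer_v), 2))  # level difference in dB
```

Voltage ratios use 20 log10 and power ratios use 10 log10, which is why dBm and dBW are handled differently from dBu and dBV here.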

Example of gain analysis of a complete system (same example as in TR.53). Gain structure analysis and definition of electronic and room noise floor, signal-to-noise ratio, headroom, and dynamic range. We also covered TR.81-96, although not in detail. In general, the subject of psychoacoustics is fascinating but also involved, so we cannot spend too much time on it in our lectures. I will focus on the most important properties, which will be introduced when needed.

We also did a basic introduction to the mathematical properties of signals, starting from Fourier series, and looked at the detailed derivation of the spectral analysis of square and triangular waveforms. The material was taken mostly from DSPP.125-135.
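For reference, the standard Fourier series of unit-amplitude square and triangular waves contain only odd harmonics, with amplitudes falling as 1/k and 1/k^2 respectively. The sketch below evaluates partial sums of those series; it is an illustration under the usual textbook normalization (odd-symmetric waveforms with peak value 1), not a transcription of the derivation in DSPP.

```python
import math

def square_partial_sum(t, f0, n_harmonics):
    """Partial Fourier sum of a unit square wave: (4/pi) * sum of sin(k w t)/k, k odd."""
    total = 0.0
    for k in range(1, 2 * n_harmonics, 2):
        total += math.sin(2.0 * math.pi * k * f0 * t) / k
    return 4.0 / math.pi * total

def triangle_partial_sum(t, f0, n_harmonics):
    """Partial Fourier sum of a unit triangle wave: odd harmonics, alternating sign, 1/k^2."""
    total = 0.0
    for k in range(1, 2 * n_harmonics, 2):
        sign = -1.0 if ((k - 1) // 2) % 2 else 1.0
        total += sign * math.sin(2.0 * math.pi * k * f0 * t) / (k * k)
    return 8.0 / math.pi ** 2 * total

# Evaluate both at a quarter period, where the ideal waveforms equal 1.
print(square_partial_sum(0.25, 1.0, 200))
print(triangle_partial_sum(0.25, 1.0, 50))
```

The triangle wave converges much faster than the square wave because its harmonics decay as 1/k^2, which matches its smoother shape.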

Homework 1 is assigned and is due in class on February 10.

Week 3

February 3

We started with more listening examples (noises, impulses, phase-shifted sine waves, frequency-shifted sine waves and beating, 3- and 7-path reflection models in action). Quick introduction to Steinberg's Wavelab audio editing/mastering program.

We reviewed the continuous-time Fourier Transform and basic linear systems concepts (impulse response, convolution, frequency response, spectrum - magnitude and phase, output of a linear system to a cosine input). Students should review this as well as sampling and the Discrete Fourier Transform in preparation for our discussion of A/D conversion.

Discussion of different microphone types: dynamic, condenser, ribbon, etc. Pickup patterns (omni, cardioid, super/hypercardioid, figure eight, hemispherical/boundary). Frequency response (on-axis, off-axis), low frequency response (need for elastic suspension and pop screens), proximity effect, transient response. Electrical parameters of microphones - frequency response, sensitivity, and noise. Example of calculation of sensitivity for a Neumann KM183 mic in dBV/dBSPL. We also looked at the characteristics of the Neumann TLM103 and Shure SM58.
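A sensitivity calculation of the kind mentioned above can be sketched as follows. The 12 mV/Pa figure is a placeholder chosen for illustration only and should not be read as the datasheet value of any particular microphone; the function names are likewise hypothetical. The sketch uses the common convention that 1 Pa RMS corresponds to 94 dB SPL.

```python
import math

def sensitivity_dbv_per_pa(mv_per_pa):
    """Convert a sensitivity in mV/Pa to dBV referenced to 1 Pa (94 dB SPL)."""
    return 20.0 * math.log10(mv_per_pa / 1000.0)

def output_dbv(mv_per_pa, spl_db):
    """Open-circuit output level in dBV for a given sound pressure level."""
    # Every dB of SPL above 94 raises the output voltage by one dB.
    return sensitivity_dbv_per_pa(mv_per_pa) + (spl_db - 94.0)

ASSUMED_MV_PER_PA = 12.0  # illustrative value only, not a quoted specification

print(round(sensitivity_dbv_per_pa(ASSUMED_MV_PER_PA), 2))  # dBV at 94 dB SPL
print(round(output_dbv(ASSUMED_MV_PER_PA, 114.0), 2))       # dBV at 114 dB SPL
```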

The discussion was mostly drawn from TR.197-207, but with additional material from SRH.113-132 and MIC.97-105. Sound examples were from MIC.

Supporting material

Sound examples: effect of mic type on voice (#2), proximity effect (#3), effect of mic types on instrumental sounds (#4), recording alternatives for acoustic guitar (#24).

Week 4

February 10

We started discussing the Precedence or Haas effect and its implications for stereophonic sound. Phantom image location using amplitude difference or time delay between channels (tracks 29, 30 from the Total Recording CD). Stereo using headphones; basics of surround sound. We then moved on to basic designs for loudspeaker drivers (electromagnetic and piezoelectric), low frequency drivers, enclosures - simple and vented, low frequency horns, high frequency drivers and horns. We examined passive and active crossovers, including their use for compensating vertical phase cancellation patterns. We also looked briefly at the specifications for the Mackie SRM 450 bi-amplified speaker. We finished our discussion with a typical scenario encountered in sound reinforcement where a feedback loop may be created between the microphone and the speaker, and showed how directionality (of mic and speaker) can help to provide increased gain for an audience.

We then started discussing sampling. We covered basic sampling theory and aliasing, and we derived the spectrum of the sampled continuous-time signal. Next week we will continue along this track to discuss A/D conversion.
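Aliasing can be illustrated with a few lines of code: because the spectrum of a sampled signal repeats at multiples of the sampling rate, any input frequency folds back into the band from 0 to half the sampling rate. The helper below is a sketch with a made-up name that computes where a real sinusoid ends up after sampling without an antialiasing filter.

```python
def alias_frequency(f_hz, fs_hz):
    """Apparent frequency (0..fs/2) of a real sinusoid at f_hz sampled at fs_hz."""
    # Subtract the nearest multiple of fs, then fold negative results to positive.
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))

print(alias_frequency(1000.0, 44100.0))   # below Nyquist: unchanged
print(alias_frequency(30000.0, 44100.0))  # above Nyquist: folds down
print(alias_frequency(45100.0, 44100.0))  # just above fs: appears near 1 kHz
```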

Supporting material

JBL Sound System Design Reference Manual (Part 1, Part 2)

Rane Corp., Linkwitz-Riley Crossovers

H. Haas, "The Influence of a Single Echo on the Audibility of Speech", Journal of the AES, March 1972 (originally written as a doctoral dissertation in 1949).

Week 5

February 17

Mathematical derivation for the computation of the spectrum of the sampled signal, and derivation of the Discrete-Time Fourier Transform. Antialiasing filters. D/A conversion as an interpolation problem. Ideal interpolation using low pass filtering, staircase reconstructor and its impulse response/frequency response. D/A equalization filters. Anti-image post-filters. Quantization, computation of noise energy, derivation of the rule of 6 dB of SNR per bit. Typical sampling rates/depths in audio and music. Oversampling A/D converters: delta modulation, overload and granular noise, second order prediction, sigma-delta modulation. Simplified model of sigma-delta system, computation of signal and noise transfer functions, spectrum of modulation noise for highly oversampled signals.
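Two of the ideas above lend themselves to a short numerical sketch. The first function gives the familiar result for a full-scale sine wave and a uniform quantizer, roughly 6.02 dB per bit plus 1.76 dB. The second is a textbook first-order sigma-delta loop (integrator, 1-bit quantizer, unit-delay feedback); it is a simplified model for illustration, not necessarily the exact structure analyzed in the lecture, and the function names are invented.

```python
import math

def sqnr_db(bits):
    """Theoretical SNR of a full-scale sine through an ideal uniform quantizer."""
    # Signal-to-noise power ratio is 1.5 * 2^(2N), about 6.02*N + 1.76 dB.
    return 20.0 * bits * math.log10(2.0) + 10.0 * math.log10(1.5)

def sigma_delta_first_order(samples):
    """First-order sigma-delta modulator: returns a +1/-1 stream for inputs in [-1, 1]."""
    integrator = 0.0
    feedback = 0.0
    stream = []
    for x in samples:
        integrator += x - feedback               # accumulate the error
        y = 1.0 if integrator >= 0.0 else -1.0   # 1-bit quantizer
        stream.append(y)
        feedback = y                             # unit-delay feedback path
    return stream

print(round(sqnr_db(16), 2), round(sqnr_db(24), 2))

# The local average of the 1-bit stream tracks the input: a DC input of 0.5
# yields a stream whose mean approaches 0.5.
stream = sigma_delta_first_order([0.5] * 4000)
print(sum(stream) / len(stream))
```

Averaging the bit stream is the crudest possible decimation filter; it shows why a low pass filter after the modulator recovers a multi-bit signal.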

Supporting material

B. Blesser, "Digitization of Audio: A Comprehensive Examination of Theory, Implementation, and Current Practice", Journal of the AES, Vol. 26, No. 10, October 1978, pp. 739-771.

T. Baba, "Microphones for DSD Recordings", AES Conference, Microphones and Loudspeakers: The Ins & Outs of Audio, London, UK, 1998, pp. 40-43.

Week 6

March 2

Discussion of noise shaping in oversampled sigma-delta A/D converters, conversion from 1 bit at high sample rates to N bits at the target lower bitrate via digital low pass filtering. Sony DSD format for Super Audio CD (SACD). Dithering - motivation, examples from imaging and printing, generation of rectangular, triangular, and high-pass dither and corresponding spectra. Sound examples (using WaveLab) of dither when scaling from 16 bits to 8 bits. Apogee UV-22 and POW-r dither. Examples of A/D and D/A converters (other than oversampling); simple analysis of the R-2R D/A converter. Quality metrics for A/D converters and definition of Total Harmonic Distortion (THD). Definition of THD plus noise metric (THD+N). Technical characteristics of the Texas Instruments PCM4202 24-bit 192 kHz stereo audio A/D converter chip.
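The effect of dither when reducing word length from 16 to 8 bits can be sketched numerically. The example below uses triangular (TPDF) dither spanning plus or minus one 8-bit step, formed as the difference of two uniform random numbers; this is a generic illustration and does not attempt to reproduce UV-22 or POW-r, which are proprietary. A simple THD helper is included as well. All names are hypothetical.

```python
import math
import random

def requantize_16_to_8(sample, rng=None):
    """Reduce a signed 16-bit sample to signed 8 bits; adds TPDF dither if rng is given."""
    step = 256  # one 8-bit step expressed in 16-bit units
    dither = 0.0
    if rng is not None:
        # Difference of two uniforms has a triangular PDF over (-1, +1) step.
        dither = (rng.random() - rng.random()) * step
    q = math.floor((sample + dither) / step + 0.5)
    return max(-128, min(127, q))

def thd_percent(fundamental_rms, harmonic_rms_values):
    """THD as the RMS sum of the harmonics relative to the fundamental, in percent."""
    return 100.0 * math.sqrt(sum(h * h for h in harmonic_rms_values)) / fundamental_rms

# A constant input of a quarter step is lost entirely without dither,
# but survives on average when dither is applied.
rng = random.Random(0)
plain = requantize_16_to_8(64)
dithered = [requantize_16_to_8(64, rng) for _ in range(20000)]
print(plain, sum(dithered) / len(dithered))
print(thd_percent(1.0, [0.03, 0.04]))
```

The dithered output is noisier sample by sample, but its average carries the sub-step information that plain rounding discards.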

Digital transmission and storage formats. Technical characteristics (number of channels, bits/sample, sampling rate, physical cabling) of professional and consumer grade transmission formats: S/PDIF, AES/EBU (AES-3), AES-10, ADAT, TDIF. Double-channel mode. (Note: we did not discuss the very important issues of synchronization and Word Clock, but we will in a later lecture.) File formats for audio - need for metadata and standardized layouts. Formats discussed: raw, Sun AU, Electronic Arts IFF and the concept of a "chunk", Apple AIFF and AIFC, Microsoft Wave (WAV), and Broadcast Wave File (BWF).
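The "chunk" concept can be made concrete with a small sketch that walks the chunks of a WAV file. A RIFF file starts with a 12-byte header (the tag "RIFF", a little-endian size, and a form type such as "WAVE"), followed by chunks that each carry a 4-byte ID, a 4-byte little-endian size, and a payload padded to an even length. The helper names below are invented for illustration, and the test file is generated in memory with Python's standard wave module.

```python
import io
import struct
import wave

def list_riff_chunks(data):
    """Return (form_type, [(chunk_id, chunk_size), ...]) for RIFF data in memory."""
    tag, _riff_size, form = struct.unpack("<4sI4s", data[:12])
    if tag != b"RIFF":
        raise ValueError("not a RIFF file")
    chunks = []
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack("<4sI", data[pos:pos + 8])
        chunks.append((chunk_id.decode("ascii"), size))
        pos += 8 + size + (size & 1)  # payloads are padded to an even byte count
    return form.decode("ascii"), chunks

def make_silent_wav(n_frames=100, rate=44100):
    """Build a small mono 16-bit WAV file in memory for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()

form, chunks = list_riff_chunks(make_silent_wav())
print(form, chunks)
```

AIFF follows the same chunked idea with big-endian sizes and a "FORM" header, so the loop above would need only small changes to read it; a BWF file is a WAV file with additional chunks.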

Supporting material

K. Greenebaum, C. Hresko, A. Eleftheriadis, and D. Hong, "Audio File Formats: A Formal Description-Based Approach", in Audio Anecdotes, K. Greenebaum, Editor, A. K. Peters, 2004 (ISBN 1-56881-104-7). [This provides an overview of the different audio file formats using our Flavor language. Flavor itself is described in the companion chapter shown below.]

A. Eleftheriadis and D. Hong, "Using Flavor for (Media) Bitstream Representation", in Audio Anecdotes, K. Greenebaum, Editor, A. K. Peters, 2004 (ISBN 1-56881-104-7).

Texas Instruments (Burr Brown), Technical specifications of PCM4202 24-bit, 192 kHz A/D converter chip.

Texas Instruments (Burr Brown), Application Bulletin: Dynamic Performance Testing of Digital Audio D/A Converters (SBAA055, 1997).

Texas Instruments, Audio Solutions Guide, 2003.

Audio Engineering Society (AES) Web Site, Standards in Print. [Provides download access to all published AES specifications, including AES/EBU.]

Week 7

March 23

Patch bays - normalled and half-normalled. Mixing consoles. General functional components: inputs (mic, mic pad and trim, line level), signal processing (EQ filter banks, peaking vs. shelving filters), inserts (Y connections) and direct outs, faders and meters, mixing buses using active electronics, pan-pots and constant amplitude vs. constant power panning. Detailed console block diagrams for NxKxL designs. Split and inline (I/O) console designs. Submasters, master fader, monitoring, soloing, talkback. Overview of the Behringer MX3242X analog mixing console and lab demonstration.
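The difference between constant amplitude and constant power panning can be shown with a short sketch. The sine/cosine law used for the constant power case is one common choice among several; the function names and the 0-to-1 pan position convention are assumptions made for illustration.

```python
import math

def pan_constant_amplitude(position):
    """Linear pan law: position 0.0 is hard left, 1.0 is hard right. Gains sum to 1."""
    return 1.0 - position, position

def pan_constant_power(position):
    """Sine/cosine pan law: the squared gains always sum to 1."""
    theta = position * math.pi / 2.0
    return math.cos(theta), math.sin(theta)

def to_db(gain):
    return 20.0 * math.log10(gain)

left, right = pan_constant_amplitude(0.5)
print(round(to_db(left), 2), round(left ** 2 + right ** 2, 3))  # center gain, total power

left, right = pan_constant_power(0.5)
print(round(to_db(left), 2), round(left ** 2 + right ** 2, 3))
```

With the linear law each channel is about 6 dB down at center and the total acoustic power dips; with the constant power law each channel is about 3 dB down and the total power stays constant across the pan range.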

The material is covered in pages 269-302 from "Total Recording". We will discuss digital consoles in the next lecture. The manuals for the two example consoles we will examine are available for download below.

Supporting material

Behringer Eurorack MX3242X Analog Mixing Console, Owner's Manual.

Yamaha 02R96 Digital Mixing Console, Owner's Manual.

Week 8

March 30

Digital Consoles: desk layout, I/O, patching, layers. On-board effects, automation, scene recall, use as control surface for DAWs, interfacing to hard disk recorders. Overview of the Yamaha 02R96.

Overview of Digidesign Pro Tools. The following is a brief and not exhaustive list of the things we covered. Overview of the program's visual layout (different screen areas and button groups in the mix and edit windows). Setups (hardware, I/O), Preferences. Track types. Playback and recording modes. Use of inserts and sends. Edit modes (shuffle, grid, etc.). Basic editing operations - region processing, crossfades, fade-outs.

Week 9

April 6

We continued our overview of Pro Tools. We covered the different plug-in types (TDM, RTAS, and AS) and their differences. MIDI support, software synthesizers as RTAS plug-ins (e.g., SampleTank), software synthesizers as ReWire clients (e.g., Reason). We showed how to record MIDI, how to create a click track using the Click plug-in, and how to create a tempo map from a pre-recorded track. Use of Master faders and bouncing to disk.

Week 10

April 13

We went quickly through Chapter 1 of the DAFX book, primarily the Spectrum Analysis section. We discussed the different families of effects and their categorization according to the domain in which they operate. We then focused on frequency selective filters, which is the subject of Chapter 2, covering the material until the discussion of FIR filtering.

Week 10 (Makeup)

April 16

Peaking/shelving filters. Filter-based effects: wah-wah, phaser. Delay-based systems: FIR, IIR, and universal comb filters. Fractional delay systems. Delay-based effects: slapback, echo, flanger, chorus. Natural sounding comb filter (IIR comb with filtered feedback path).
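The FIR, IIR, and allpass comb filters can all be obtained from one structure with three gains: blend (BL), feedback (FB), and feedforward (FF). The sketch below follows the usual textbook form of the universal comb filter, x_h(n) = x(n) + FB x_h(n-M) and y(n) = FF x_h(n-M) + BL x_h(n), with an integer delay of M samples. Treat the parameter settings in the comments as the commonly quoted ones and check them against the DAFX text before relying on them.

```python
def universal_comb(x, delay, bl, fb, ff):
    """Universal comb filter with an integer delay of `delay` samples."""
    xh = [0.0] * len(x)  # internal delay-line input signal
    y = [0.0] * len(x)
    for n in range(len(x)):
        xh_delayed = xh[n - delay] if n >= delay else 0.0
        xh[n] = x[n] + fb * xh_delayed
        y[n] = ff * xh_delayed + bl * xh[n]
    return y

impulse = [1.0] + [0.0] * 9

# FIR comb: BL = 1, FB = 0, FF = g  (a single echo)
print(universal_comb(impulse, 3, 1.0, 0.0, 0.5))

# IIR comb: BL = 1, FB = g, FF = 0  (a decaying train of echoes)
print(universal_comb(impulse, 3, 1.0, 0.5, 0.0))

# Allpass: BL = a, FB = -a, FF = 1  (flat magnitude response)
print(universal_comb(impulse, 3, 0.5, -0.5, 1.0))
```

Slapback, echo, flanger, and chorus differ mainly in the delay length and in whether the delay is modulated over time, which would require a fractional delay in place of the integer index used here.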

Week 12

April 20

Modulation/demodulation. Attack/release-based averaging for peak and rms estimation, dynamics of VU, PPM, and other meters. Note: from this chapter we are only interested in ring modulation, and the demodulation part. Introduction to non-linear processing. Dynamics processing and the side chain. Static characteristics for compressors, limiters, expanders, and gates, and their use in practical applications. Line equations. The dynamic behavior and RMS/peak measurement. Limiters.
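The static characteristic of a compressor is a pair of line segments in the dB domain: unity slope below the threshold and slope 1/ratio above it. The sketch below computes the resulting gain and pairs it with a generic one-pole attack/release envelope follower for peak measurement. The smoothing form and coefficient values are illustrative assumptions and may differ in detail from the formulation used in the DAFX text.

```python
def compressor_gain_db(level_db, threshold_db, ratio):
    """Static compressor curve: gain in dB to apply for a given input level in dB."""
    if level_db <= threshold_db:
        return 0.0  # below threshold: unity gain
    # Above threshold the output rises 1 dB for every `ratio` dB of input.
    output_db = threshold_db + (level_db - threshold_db) / ratio
    return output_db - level_db

def peak_envelope(samples, attack_coeff, release_coeff):
    """One-pole peak follower with separate attack and release smoothing (0..1)."""
    env = 0.0
    out = []
    for x in samples:
        magnitude = abs(x)
        coeff = attack_coeff if magnitude > env else release_coeff
        env = (1.0 - coeff) * env + coeff * magnitude
        out.append(env)
    return out

print(compressor_gain_db(-10.0, -20.0, 4.0))  # 10 dB over threshold at 4:1
print(compressor_gain_db(-30.0, -20.0, 4.0))  # below threshold
print(peak_envelope([1.0, 1.0, 1.0, 0.0], 0.5, 0.1))
```

A limiter corresponds to a very large ratio, while expanders and gates apply the same idea below the threshold with a slope greater than one.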

Supporting material

At the presentation at NYU, Larry Loewinger distributed an interesting article that he wrote for a film magazine, which we thought should be of interest to everyone. It doesn't relate to the subject matter of today's class, but I put the link here so that people would notice.

L. Loewinger, The Rationale Behind the Position of "Sound Designer" and Why it Never Took Hold, The Independent, the magazine of the Association of Independent Video and Film Makers, October 1998.

Week 13

April 27

Continuation of dynamics processing. Compressor and expander, noise gate, de-esser, infinite limiter. We did not cover section 5.3 (Nonlinear processors), except to touch upon overdrive, distortion, and fuzz (5.3.3). We covered briefly exciters and enhancers. We then moved on to spatial effects, with particular emphasis on reverb. We discussed basic concepts for distance and space rendering (panorama and precedence effect have been covered earlier in class), and the Doppler effect (6.2), and then moved on to reverberation (6.5). We discussed the allpass filter structure (universal comb filter), Schroeder's design, Moorer's improvement, as well as convolutional reverbs. We also mentioned feedback delay networks, but without undertaking a detailed mathematical analysis.
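A Schroeder-style reverberator can be sketched as a bank of parallel feedback comb filters followed by allpass filters in series. The code below is a minimal illustration of that topology only: the delay lengths and gains are arbitrary values chosen for the example, the comb outputs are simply averaged, and Moorer's low pass filter in the comb feedback path is not included.

```python
def feedback_comb(x, delay, gain):
    """IIR comb: y[n] = x[n] + gain * y[n - delay]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        y[n] = x[n] + (gain * y[n - delay] if n >= delay else 0.0)
    return y

def allpass(x, delay, gain):
    """Schroeder allpass: y[n] = -gain * x[n] + x[n - delay] + gain * y[n - delay]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        x_delayed = x[n - delay] if n >= delay else 0.0
        y_delayed = y[n - delay] if n >= delay else 0.0
        y[n] = -gain * x[n] + x_delayed + gain * y_delayed
    return y

def schroeder_reverb(x, comb_delays, comb_gain, allpass_delays, allpass_gain):
    """Parallel feedback combs (averaged) followed by allpass filters in series."""
    combs = [feedback_comb(x, d, comb_gain) for d in comb_delays]
    y = [sum(values) / len(combs) for values in zip(*combs)]
    for d in allpass_delays:
        y = allpass(y, d, allpass_gain)
    return y

# Impulse response with illustrative (hypothetical) settings.
impulse = [1.0] + [0.0] * 4999
response = schroeder_reverb(impulse, (1031, 1327, 1523, 1871), 0.8, (223, 71), 0.7)
print(sum(1 for v in response if v != 0.0), "non-zero samples in the first 5000")
```

The combs supply the decaying echo train and the allpass stages increase echo density without coloring the magnitude response; a convolutional reverb replaces the whole structure with convolution against a measured impulse response.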


Outline of Topics

Audio and room/open air acoustics: frequency and time-domain characteristics.

Microphones and speakers for stereo and surround sound (design, characteristics, and placement).

A/D and D/A conversion, dithering, digital audio wire and file formats.

Mixing 1 – Design and architecture of analog and digital mixers. Case studies: Behringer MX3242X and Yamaha 02R96.

Mixing 2 – The mixing process, with an introduction to mastering.

Digital Audio Effects 1 – Filters, Delays, Modulators and Demodulators

Digital Audio Effects 2 – Nonlinear processing, spatial effects, time-segment and time-frequency processing.

Digital Audio Workstations 1 – Design and architecture. Case study: Digidesign Pro Tools.

Digital Audio Workstations 2 – Automation, soft-wired instruments, control surfaces. Case studies: Digidesign ProControl, Propellerheads Reason and ReWire.

Synthesizers and samplers, introduction to sound synthesis algorithms.



A. Eleftheriadis, eleft@ee.columbia.edu