Department of Electrical Engineering - Columbia University

ELEN E6820 - Spring 2009

SPEECH AND AUDIO PROCESSING AND RECOGNITION

Course outline

Each link leads to the slide pack for that lecture once it is available. For now, the links lead to last year's slide packs. I will be revising these to a greater or lesser extent through the semester, and the actual slides to be used in each lecture will be posted by the night before class.

Lecture Date Topic Paper presentation

1 2009-01-20 Course introduction: DSP review, Timescale modification

2009-01-22

2 2009-01-27 Acoustics fundamentals: Sound, waves, waveguides, resonance, energy transfer. The Phase Vocoder, Flanagan & Golden, 1966 (Graham)

2009-01-29

3 2009-02-03 Machine learning, classification, and generative models Sound Synthesis of the Harpsichord Using a Computationally Efficient Physical Model, Välimäki et al., 2004

2009-02-05
4 2009-02-10 Auditory perception fundamentals: the ear, auditory physiology, psychophysics, auditory scene analysis A tutorial on hidden Markov models and selected applications in speech recognition, Rabiner, Proc. IEEE 1989 (Adrian)

2009-02-12
5 2009-02-17 Speech models and speech synthesis: LPC, cepstrum, harmonic+noise Chimaeric sounds reveal dichotomies in auditory perception, Smith, Delgutte, and Oxenham. Nature, 2002 (Jon) (here are the sound examples)

2009-02-19
6 2009-02-24 Nonspeech: Nonspeech and music signals, sinewave modeling Unit selection in a concatenative speech synthesis system using a large speech database, Hunt and Black, ICASSP 1996 (Promiti)

2009-02-26
7 2009-03-03 Catchup Robust pitch estimation with harmonics enhancement in noisy environments based on instantaneous frequency, Abe, Kobayashi, Imai, ICSLP 1996 (Dan)

2009-03-05
2009-03-10 Midterm Project proposals

2009-03-12
2009-03-17 Spring break - no lecture

2009-03-19
8 2009-03-24 Spatial sound & rendering Learning a Precedence Effect-Like Weighting Function for the Generalized Cross-Correlation Framework Wilson & Darrell TASLP, 2006 (Dan)

2009-03-26
9 2009-03-31 Compression: Speech coding & high-quality audio compression A Tutorial on MPEG/Audio compression, Pan, 1995 (Adrian)

2009-04-02
10 2009-04-07 Speech Recognition: Features, Hidden Markov Models Weighted finite-state transducers in speech recognition, Mohri, Pereira, Riley. Computer Speech and Language, 2002 (Andrew)

2009-04-09
11 2009-04-14 Sound mixtures & separation: CASA, ICA, and model-based separation Factorial models and refiltering for speech separation and denoising, Roweis, 2003

2009-04-16
12 2009-04-21 Music analysis & recognition: Transcription, summarization, and similarity Non-negative Matrix Factorization for Polyphonic Music Transcription, Smaragdis & Brown, 2003 (Graham)

2009-04-23
2009-04-28 Project presentations

2009-04-30
(squeezed out) Analysis of Everyday Sounds: Content-based retrieval of large-scale archives, etc.

Dan Ellis <[email protected]>
Last updated: Tue Mar 31 02:54:38 PM EDT 2009