Dan Ellis: Publications (by year/topic)

See also publications organized by type.

Speech and Source Separation

Music Signal Analysis

Environmental/Marine

2014

Z. Chen, B. McFee, D. Ellis (2014)
Speech enhancement by low-rank and convolutive dictionary spectrogram decomposition
Proc. Interspeech, (to appear), Singapore, Sep 2014.

Z. Chen, H. Papadopoulos, D. Ellis (2014)
Content-adaptive speech enhancement by a sparsely-activated dictionary plus low rank decomposition
Proc. HSCMA, Nancy, May 2014.

D. Liang, D. Ellis, M. Hoffman, G. Mysore (2014)
Speech Decoloration Based On The Product-Of-Filters Model
Proc. ICASSP, (to appear), Florence, May 2014.

 

C. Raffel, B. McFee, E. J. Humphrey, J. Salamon, O. Nieto, D. Liang, D. P. W. Ellis (2014)
mir_eval: A Transparent Implementation of Common MIR Metrics
Proc. ISMIR, (to appear), Taipei, Taiwan, Oct 2014.

B. McFee, D. Ellis (2014)
Analyzing Song Structure With Spectral Clustering
Proc. ISMIR, (to appear), Taipei, Taiwan, Oct 2014.

D. Liang, J. Paisley, D. Ellis (2014)
Codebook-based Scalable Music Tagging With Poisson Matrix Factorization
Proc. ISMIR, (to appear), Taipei, Taiwan, Oct 2014.

H. Papadopoulos, D. Ellis (2014)
Music-content-adaptive robust principal component analysis for a semantically consistent separation of foreground and background in music audio signals
Proc. DAFx, (to appear), Erlangen, Sep 2014.

J. Salamon, E. Gomez, D. Ellis, G. Richard (2014)
Melody Extraction from Polyphonic Music Signals
IEEE Signal Processing Magazine, pp. 118-134, March 2014.
DOI: 10.1109/MSP.2013.2271648

B. McFee, D. Ellis (2014)
Better Beat Tracking Through Robust Onset Aggregation
Proc. ICASSP, (to appear), Florence, May 2014.

B. McFee, D. Ellis (2014)
Learning To Segment Songs With Ordinal Linear Discriminant Analysis
Proc. ICASSP, (to appear), Florence, May 2014.

M. McVicar, D. Ellis, M. Goto (2014)
Leveraging Repeated Utterances for Improved Transcription of Chorus Lyrics from Sung Audio
Proc. ICASSP, (to appear), Florence, May 2014.

C. Raffel, D. Ellis (2014)
Estimating Timing and Channel Distortion Across Related Signals
Proc. ICASSP, (to appear), Florence, May 2014.

 

D. Ellis, H. Satoh, and Z. Chen (2014)
Detecting proximity from personal audio recordings
Proc. Interspeech, (to appear), Singapore, Sep 2014.

2013

M. Graciarena, A. Alwan, D. Ellis, H. Franco, L. Ferrer, J. Hansen, A. Janin, B.-S. Lee, Y. Lei, V. Mitra, N. Morgan, S. O. Sadjadi, T.J. Tsai, N. Scheffer, L. N. Tan, B. Williams (2013)
All for One: Feature Combination for Highly Channel-Degraded Speech Activity Detection
Proc. Interspeech, Lyon, August 2013, paper 1338.

Z. Chen and D. Ellis (2013)
Speech Enhancement By Sparse, Low-Rank, And Dictionary Spectrogram Decomposition
Proc. IEEE WASPAA, Mohonk, October 2013.
DOI: 10.1109/WASPAA.2013.6701883

 

D. Liang, M. Hoffman, D. Ellis (2013)
Beta Process Sparse Nonnegative Matrix Factorization For Music
Proc. ISMIR, 375-380, Curitiba, November 2013 (Best Student Paper award).

D. Silva, H. Papadopoulos, G. Batista, D. Ellis (2013)
A Video Compression-Based Approach To Measure Music Structure Similarity
Proc. ISMIR, 95-100, Curitiba, November 2013.

D. Gillespie and D. Ellis (2013)
Modeling nonlinear circuits with linearized dynamical models via kernel regression
Proc. IEEE WASPAA, Mohonk, October 2013.
DOI: 10.1109/WASPAA.2013.6701830

 

D. Silva, V. de Souza, G. Batista, E. Keogh, D. Ellis (2013)
Applying Machine Learning and Audio Analysis Techniques to Insect Recognition in Intelligent Traps
Proc. ICMLA, Miami, December 2013.

C. Cotton and D. Ellis (2013)
Subband Autocorrelation Features for Video Soundtrack Classification
Proc. ICASSP-13, 8663-8666, Vancouver, May 2013.

2012

B.-S. Lee and D. Ellis (2012)
Noise Robust Pitch Tracking by Subband Autocorrelation Classification
Proc. Interspeech-12, Portland, September 2012, paper P3b.05.

J. McDermott, D. Ellis, H. Kawahara (2012)
Inharmonic Speech: A Tool for the Study of Speech Perception and Separation
Proc. SAPA-SCALE 2012, Portland, September 2012, 114-117.

 

T. Bertin-Mahieux and D. Ellis (2012)
Large-Scale Cover Song Recognition Using the 2D Fourier Transform Magnitude
Proc. ISMIR-12, Porto, October 2012, 241-246.

B. McFee, T. Bertin-Mahieux, D. Ellis, and G. Lanckriet (2012)
The Million Song Dataset Challenge
Proc. WWW-2012 AdMIRe Workshop, Lyon, April 2012, 909-916.

 

K. Su, M. Naaman, A. Gurjar, M. Patel, and D. Ellis (2012)
Making a Scene: Alignment of Complete Sets of Clips based on Pairwise Audio Match
Proc. ICMR-12, Hong Kong, June 2012, 26-33.

2011

R. Weiss, M. Mandel, and D. Ellis (2011)
Combining localization cues and source model constraints for binaural source separation
Speech Communication, vol. 53 no. 5, pp. 606-621, May 2011.
DOI: 10.1016/j.specom.2011.01.003

 

J. Devaney, M. Mandel, D. Ellis, I. Fujinaga (2011)
Automatically extracting performance data from recordings of trained singers
Psychomusicology: Music, Mind & Brain 21(1-2), pp. 108-136, 2011.

T. Bertin-Mahieux, D. Ellis, B. Whitman, and P. Lamere (2011)
The Million Song Dataset
Proc. ISMIR, pp. 591-596, Miami, October 2011.

D. Ellis, B. Whitman, and A. Porter (2011)
Echoprint - An Open Music Identification Service
Proc. ISMIR, late-breaking session, Miami, October 2011.

T. Bertin-Mahieux and D. Ellis (2011)
Large-Scale Cover Song Recognition Using Hashed Chroma Landmarks
Proc. IEEE WASPAA, pp. 117-120, Mohonk, October 2011.

G. Grindlay and D. Ellis (2011)
Transcribing Multi-instrument Polyphonic Music with Hierarchical Eigeninstruments
IEEE J. Sel. Topics in Sig. Process., vol. 5 no. 6, pp. 1159-1169, October 2011.
DOI: 10.1109/JSTSP.2011.2162395

M. Mueller, D. Ellis, A. Klapuri, and G. Richard (2011)
Signal Processing for Music Analysis
IEEE J. Sel. Topics in Sig. Process., vol. 5 no. 6, pp. 1088-1110, October 2011.
DOI: 10.1109/JSTSP.2011.2112333

M. Mueller, D. Ellis, A. Klapuri, G. Richard, and S. Sagayama (2011)
Introduction to the Special Issue on Music Signal Processing
IEEE J. Sel. Topics in Sig. Process., vol. 5 no. 6, pp. 1085-1087, October 2011.
DOI: 10.1109/JSTSP.2011.2165109

T. Bertin-Mahieux, G. Grindlay, R. Weiss, and D. Ellis (2011)
Evaluating music sequence models through missing data
Proc. IEEE ICASSP, pp. 177-180, Prague, May 2011.

 

C. Cotton and D. Ellis (2011)
Spectral vs. Spectro-Temporal Features for Acoustic Event Detection
Proc. IEEE WASPAA, pp. 69-72, Mohonk, October 2011.

C. Cotton, D. Ellis, and A. Loui (2011)
Soundtrack classification by transient events
Proc. IEEE ICASSP, pp. 473-476, Prague, May 2011.

D. Ellis, X. Zheng, and J. McDermott (2011)
Classifying soundtracks with audio texture features
Proc. IEEE ICASSP, pp. 5880-5883, Prague, May 2011.

C. Vezyrtzis, A. Klein, D. Ellis, Y. Tsividis (2011)
Direct Processing of MPEG Audio Using Companding and BFP Techniques
Proc. IEEE ICASSP, pp. 361-364, Prague, May 2011.

Y.-G. Jiang, G. Ye, S.-F. Chang, D. Ellis, and A. C. Loui (2011)
Consumer Video Understanding: A Benchmark Database and An Evaluation of Human and Machine Performance
Proc. ACM ICMR, article #29, Trento, Apr 2011.

2010

M. Mandel, S. Bressler, B. Shinn-Cunningham, and D. Ellis (2010)
Evaluating Source Separation Algorithms With Reverberant Speech
IEEE Tr. Audio, Speech, and Lang. Proc., vol. 18 no. 7, pp. 1872-1883, September 2010.
DOI: 10.1109/TASL.2010.2052252

M. Mandel, R. Weiss, and D. Ellis (2010)
Model-Based Expectation-Maximization Source Separation and Localization
IEEE Tr. Audio, Speech, and Lang. Proc., vol. 18 no. 2, pp. 382-394, February 2010.
DOI: 10.1109/TASL.2009.2029711

R. Weiss and D. Ellis (2010)
Speech separation using speaker-adapted eigenvoice speech models
Computer Speech and Language, vol. 24 no. 1, pp. 16-29, Jan 2010.
DOI: 10.1016/j.csl.2008.03.003

D. Ellis (2010)
An introduction to signal processing for speech
chapter 22 in The Handbook of Phonetic Science, 2nd ed., ed. Hardcastle, Laver, and Gibbon. pp. 757-780, Blackwell.

 

G. Grindlay and D. Ellis (2010)
A Probabilistic Subspace Model for Multi-Instrument Polyphonic Transcription
Proc. ISMIR, pp. 21-26, Utrecht, August 2010.

T. Bertin-Mahieux, R. Weiss, and D. Ellis (2010)
Clustering beat-chroma patterns in a large music database
Proc. ISMIR, pp. 111-116, Utrecht, August 2010.

D. Ellis, B. Whitman, T. Jehan, and P. Lamere (2010)
The Echo Nest Musical Fingerprint
ISMIR Late Breaking Abstracts, Utrecht, August 2010.

D. Ellis and A. Weller (2010)
The 2010 LabROSA chord recognition system
MIREX 2010 system abstracts, August 2010.

S. Ravuri and D. Ellis (2010)
Cover Song Detection: From High Scores to General Classification
Proc. IEEE ICASSP, pp. 65-68, Dallas, March 2010.

 

K. Lee and D. Ellis (2010)
Audio-Based Semantic Concept Classification for Consumer Video
IEEE Tr. Audio, Speech and Lang. Proc., vol. 18 no. 6, pp. 1406-1416, Aug. 2010.
DOI: 10.1109/TASL.2009.2034776

C. Cotton and D. Ellis (2010)
Audio Fingerprinting to Identify Multiple Videos of an Event
Proc. IEEE ICASSP, pp. 2386-2389, Dallas, March 2010.

K. Lee, D. Ellis, and A. Loui (2010)
Detecting Local Semantic Concepts in Environmental Sounds using Markov Model based Clustering
Proc. IEEE ICASSP, pp. 2278-2281, Dallas, March 2010.

2009

M. Mandel and D. Ellis (2009)
The Ideal Interaural Parameter Mask: A Bound on Binaural Separation Systems
Proc. WASPAA-09, Mohonk NY, October 2009, pp. 85-88.

J. Gudnason, M. Thomas, P. Naylor, and D. Ellis (2009)
Voice Source Waveform Analysis and Synthesis using Principal Component Analysis and Gaussian Mixture Modelling
Proc. Interspeech-09, Brighton, September 2009, pp. 108-111.

J. B. Boldt and D. Ellis (2009)
A Simple Correlation-Based Model of Intelligibility for Nonlinear Speech Enhancement and Separation
Proc. EUSIPCO'09, Glasgow, August 2009, pp. 1849-1853.

R. Weiss and D. Ellis (2009)
A Variational EM Algorithm for Learning Eigenvoice Parameters in Mixed Signals
Proc. ICASSP-09, pp. 113-116, Taiwan, April 2009.

 

A. Weller, D. Ellis, and T. Jebara (2009)
Structured Prediction Models for Chord Transcription of Music Audio
Proc. ICMLA, Miami Beach FL, December 2009, pp. 590-595.

C. Smit and D. Ellis (2009)
Guided Harmonic Sinusoid Estimation in a Multi-Pitch Environment
Proc. WASPAA-09, Mohonk NY, October 2009, pp. 41-44.

G. Grindlay and D. Ellis (2009)
Multi-Voice Polyphonic Music Transcription Using Eigeninstruments
Proc. WASPAA-09, Mohonk NY, October 2009, pp. 53-56.

J. Devaney, M. Mandel, and D. Ellis (2009)
Improving Midi-Audio Alignment with Acoustic Features
Proc. WASPAA-09, Mohonk NY, October 2009, pp. 45-48.

J. H. Jensen, M. G. Christensen, D. P. W. Ellis, and S. H. Jensen (2009)
Quantitative analysis of a common audio similarity measure
IEEE Tr. Audio, Speech, Lang. Proc., vol. 17 no. 4 pp. 693-703, May 2009.

J. Devaney and D. Ellis (2009)
Handling Asynchrony in Audio-Score Alignment
Proc. ICMC-09, Montreal, pp. 29-32, August 2009.

 

C. Cotton and D. Ellis (2009)
Finding Similar Acoustic Events using Matching Pursuit and Locality-Sensitive Hashing
Proc. WASPAA-09, Mohonk NY, October 2009, pp. 125-128.

W. Jiang, C. Cotton, S.-F. Chang, D. Ellis, and A. Loui (2009)
Short-Term Audio-Visual Atoms for Generic Video Concept Classification
Proc. ACM MultiMedia-09, Beijing, October 2009, pp. 5-14.

2008

R. Weiss, M. Mandel, D. Ellis (2008)
Source Separation Based on Binaural Cues and Source Model Constraints
Proc. Interspeech-08, pp. 419-422, Brisbane, Australia, September 2008.

K. Hu, P. Divenyi, D. Ellis, Z. Jin, B. Shinn-Cunningham, D. Wang (2008)
Preliminary Intelligibility Tests of a Monaural Speech Segregation System
Proc. SAPA-08, pp. 11-16, Brisbane, Australia, September 2008.

A. Lammert, D. Ellis, P. Divenyi (2008)
Data-driven articulatory inversion incorporating articulator priors
Proc. SAPA-08, pp. 29-34, Brisbane, Australia, September 2008.

S. Ravuri and D. Ellis (2008)
Stylization of Pitch with Syllable-Based Linear Segments
Proc. ICASSP-08, pp. 3985-3988, Las Vegas, April 2008.

 

M. Mandel and D. Ellis (2008)
Multiple-Instance Learning For Music Information Retrieval
Proc. ISMIR 2008, pp. 577-582, Philadelphia, September 2008.

J. Devaney and D. Ellis (2008)
An Empirical Approach to Studying Intonation Tendencies in Polyphonic Vocal Performances
J. Interdisc. Music Studies, vol. 2 no. 1-2, Spring/Fall 2008, pp. 141-156. (16pp)

M. Mandel and D. Ellis (2008)
A Web-based Game for Collecting Music Metadata
J. New Music Research, vol. 37 no. 2, pp. 151-165, 2008.

D. Ellis, C. Cotton, and M. Mandel (2008)
Cross-Correlation of Beat-Synchronous Representations for Music Similarity
Proc. ICASSP-08, pp. 57-60, Las Vegas, April 2008.
(See also the talk slides.)

J. H. Jensen, M. G. Christensen, D. Ellis, and S. H. Jensen (2008)
A Tempo-Insensitive Distance Measure for Cover Song Identification based on Chroma Features
Proc. ICASSP-08, pp. 2209-2212, Las Vegas, April 2008.

M. Slaney, D. Ellis, M. Sandler, M. Goto, M. Goodwin (2008)
Introduction to the Special Issue on Music Information Retrieval
IEEE Tr. Audio, Speech, Lang. Proc., vol. 16 no. 2, pp. 253-254, Feb 2008.

 

K. Lee and D. Ellis (2008)
Detecting Music in Ambient Audio by Long-Window Autocorrelation
Proc. ICASSP-08, pp. 9-12, Las Vegas, April 2008.

2007

M. Mandel and D. Ellis (2007)
EM localization and separation using interaural level and phase cues
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, pp. 275-278, Mohonk NY, October 2007.

R. Weiss and D. Ellis (2007)
Monaural speech separation using source-adapted models
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, pp. 114-117, Mohonk NY, October 2007.

M. Athineos and D. Ellis (2007)
Autoregressive Modeling of Temporal Envelopes
IEEE Tr. Signal Processing, vol. 55 no. 11, pp. 5237-5245, Nov 2007.

P. Scanlon, D. Ellis, R. Reilly (2007)
Using Broad Phonetic Group Experts for Improved Speech Recognition
IEEE Tr. Audio, Speech, Lang. Proc., vol. 15 no. 3, pp. 803-812, March 2007.

 

C. Smit and D. Ellis (2007)
Solo voice detection via optimal cancelation
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, pp. 207-210, Mohonk NY, October 2007.

G. Poliner and D. Ellis (2007)
Improving generalization for polyphonic piano transcription
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, pp. 86-89, Mohonk NY, October 2007.

D. Ellis (2007)
Classifying Music Audio with Timbral and Chroma Features
Proc. ISMIR-07, pp. 339-340, Vienna, Austria, October 2007.
(See also the poster I presented at ISMIR-07.)

M. Mandel and D. Ellis (2007)
A Web-Based Game for Collecting Music Metadata
Proc. Int. Conf. on Music Info. Retrieval ISMIR-07, pp. 365-366, Vienna, Austria, October 2007.
(See also the 6 page tech. report.)

J. H. Jensen, D. Ellis, M. G. Christensen, S. H. Jensen (2007)
Evaluation of Distance Measures Between Gaussian Mixture Models of MFCCs
Proc. Int. Conf. on Music Info. Retrieval ISMIR-07, pp. 107-108, Vienna, Austria, October 2007.

D. Ellis and C. Cotton (2007)
The 2007 LabROSA Cover Song Detection System
MIREX 2007 Audio Cover Song Evaluation system description, Sep 2007. (4pp)
(See also the poster I presented at ISMIR-07.)

D. Ellis (2007)
Beat Tracking by Dynamic Programming
J. New Music Research, Special Issue on Beat and Tempo Extraction, vol. 36 no. 1, March 2007, pp. 51-60. (10pp)
DOI: 10.1080/09298210701653344

D. Ellis and G. Poliner (2007)
Identifying Cover Songs With Chroma Features and Dynamic Programming Beat Tracking
Proc. ICASSP-07 Hawai'i, pp. IV-1429-1432.

G. Poliner, D. Ellis, A. Ehmann, E. Gómez, S. Streich, B. Ong (2007)
Melody Transcription from Music Audio: Approaches and Evaluation
IEEE Tr. Audio, Speech, Lang. Proc., vol. 15 no. 4, May 2007, pp. 1247-1256.

G. Poliner and D. Ellis (2007)
A Discriminative Model for Polyphonic Piano Transcription
Eurasip Journal of Advances in Signal Processing, special issue on Music Signal Processing, 2007 (2007), Article ID 48317. (9pp)
DOI: 10.1155/2007/48317

 

S.-F. Chang, D. Ellis, W. Jiang, K. Lee, A. Yanagawa, A. Loui, J. Luo (2007)
Large-scale multimodal semantic concept detection for consumer video
Multimedia Information Retrieval workshop, ACM Multimedia Augsburg, Germany, Sep 2007, pp. 255-264.
DOI: 10.1145/1290082.1290118

J. Ogle and D. Ellis (2007)
Fingerprinting to Identify Repeated Sound Events in Long-Duration Personal Audio Recordings
Proc. ICASSP-07 Hawai'i, pp. I-233-236. (4pp)

A. Doherty, A. Smeaton, K.-S. Lee, and D. Ellis (2007)
Multimodal Segmentation of Lifelog Data
Proc. 8th Int. Conf. on Computer-Assisted Information Retrieval RIAO 2007, Pittsburgh, May 2007. (18pp)

2006

M. Mandel, D. Ellis, and T. Jebara (2006)
An EM algorithm for localizing multiple sound sources in reverberant environments
Advances Neural Info. Proc. Sys. 19, Vancouver CA, Dec 2006, pp. 953-960. (8pp)

D. Ellis (2006)
Model-Based Scene Analysis
Chapter 4 of Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, D. Wang & G. Brown, eds., Wiley/IEEE Press, pp. 115-146, 2006. (46pp)

M. Mandel and D. Ellis (2006)
A probability model for interaural phase difference
Proc. Workshop on Statistical and Perceptual Audition SAPA-06, pp. 1-6, Pittsburgh PA, Oct 2006. (6pp)

R. Weiss and D. Ellis (2006)
Estimating single-channel source separation masks: Relevance Vector Machine classifiers vs. pitch-based masking
Proc. Workshop on Statistical and Perceptual Audition SAPA-06, pp. 31-36, Pittsburgh PA, Oct 2006. (6pp)

D. Ellis and R. Weiss (2006)
Model-Based Monaural Source Separation Using a Vector-Quantized Phase-Vocoder Representation
Proc. ICASSP-06, Toulouse, May 2006, pp. V-957-960. (4pp)

D. Ellis (2006)
Modeling the auditory component of speech
Chapter 24 of Listening to speech: An auditory perspective, S. Greenberg & W. Ainsworth, eds., Lawrence Erlbaum, pp.393-307, 2006. (13pp)

D. Ellis, B. Raj, J. Brown, M. Slaney, P. Smaragdis (2006)
Editorial - Special Section on Statistical and Perceptual Audio Processing
IEEE Tr. Audio, Speech and Lang. Proc., vol 14 no 1, pp. 2-4, Jan. 2006. (3pp)

 

D. Ellis (2006)
Extracting Information from Music Audio
Communications of the ACM invited paper, special issue on Music Information Retrieval, vol. 49, no. 8, pp. 32-37, August 2006. (6pp)

D. Ellis and G. Poliner (2006)
Classification-Based Melody Transcription
Machine Learning, special issue on Machine Learning In and For Music, vol. 65, no. 2-3, pp. 439-456, Dec 2006. (18pp)
DOI: 10.1007/s10994-006-8373-9

M. Mandel, G. Poliner, D. Ellis (2006)
Support Vector Machine Active Learning for Music Retrieval
Multimedia Systems, special issue on Machine Learning Approaches to Multimedia Information Retrieval, vol. 12, no. 1, pp. 3-13, Aug 2006. (10pp)
DOI: 10.1007/s00530-006-0032-2

D. Ellis (2006)
Identifying `Cover Songs' with Beat-Synchronous Chroma Features
MIREX 2006 Audio Cover Song Contest system description, Sep 2006. (4pp)

D. Ellis (2006)
Beat Tracking with Dynamic Programming
MIREX 2006 Audio Beat Tracking Contest system description, Sep 2006. (3pp)

 

K. Lee and D. Ellis (2006)
Voice Activity Detection in Personal Audio Recordings Using Autocorrelogram Compensation
Interspeech ICSLP-06, pp. 1970-1973, Pittsburgh, Oct 2006. (4pp)

D. Ellis and K. Lee (2006)
Accessing minimal-impact personal audio archives
IEEE MultiMedia, vol. 13 no. 4, Oct-Dec 2006, pp. 30-38. (9pp)

X. Halkias and D. Ellis (2006)
Call detection and extraction using Bayesian inference
Applied Acoustics, special issue on Marine Mammal Detection, vol. 67, no. 11-12, Nov-Dec. 2006, pp. 1164-1174 (11pp).

X. Halkias and D. Ellis (2006)
Estimating the Number of Marine Mammals using Recordings of Clicks from One Microphone
Proc. ICASSP-06, Toulouse, May 2006, pp. V-769-772. (4pp).

2005

N. Morgan, Q. Zhu, A. Stolcke, K. Sonmez, S. Sivadas, T. Shinozaki, M. Ostendorf, P. Jain, H. Hermansky, D. Ellis, G. Doddington, B. Chen, O. Cetin, H. Bourlard, and M. Athineos (2005)
Pushing the Envelope -- Aside
IEEE Signal Processing Magazine 22(5), pp. 81-88, Sep. 2005. (8pp)

C.-P. Chen, J. Bilmes, D. Ellis (2005)
Speech Feature Smoothing for Robust ASR
Proc. ICASSP-05, Philadelphia, March 2005, pp. I-525-528. (4pp)

M. Reyes-Gomez, N. Jojic, and D. Ellis (2005)
Deformable Spectrograms
AI & Statistics 2005, Barbados, Jan. 2005, pp. 285-292. (8pp)

J. Barker, M. Cooke, D. Ellis (2005)
Decoding speech in the presence of other sources
Speech Communication, 45(1), Jan. 2005, pp. 5-25. (26pp)

 

G. Poliner, D. Ellis (2005)
A Classification Approach to Melody Transcription
Proc. Int. Conf. on Music Info. Retrieval ISMIR-05, London, September 2005, pp. 161-166. (6pp)

M. Mandel, D. Ellis (2005)
Song-Level Features and Support Vector Machines for Music Classification
Proc. Int. Conf. on Music Info. Retrieval ISMIR-05, London, September 2005, pp. 594-599. (6pp)

 

K. Dobson, B. Whitman, D. Ellis (2005)
Learning Auditory Models of Machine Voices
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-05, Mohonk NY, October 2005, pp. 339-342. (4pp)

N. Lesser, D. Ellis (2005)
Clap Detection and Discrimination for Rhythm Therapy
Proc. ICASSP-05, Philadelphia, March 2005, pp. III-37-40. (4pp)
(See also the talk slides which describe an energy ratio feature that does much better than the ones described in the paper.)

2004

M. Athineos, H. Hermansky and D. Ellis (2004)
LP-TRAP: Linear predictive temporal patterns
International Conference on Spoken Language Processing ICSLP-04, Jeju, Korea, Oct 2004, pp. 949-952. (4pp)

M. Athineos, H. Hermansky and D. Ellis (2004)
PLP^2: Autoregressive modeling of auditory-like 2-D spectro-temporal patterns
ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing SAPA-04, Jeju, Korea, Oct 2004, pp. 37-42. (5pp)

M. Reyes-Gomez, N. Jojic, and D. Ellis (2004)
Towards single-channel unsupervised source separation of speech mixtures: The layered harmonics/formants separation-tracking model
ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing SAPA-04, Jeju, Korea, Oct 2004, pp. 25-30. (6pp)

D. Ellis and J. Liu (2004)
Speaker turn segmentation based on between-channel differences
NIST Meeting Recognition Workshop @ ICASSP, pp. 112-117, Montreal, May 2004. (6pp)

L. Kennedy and D. Ellis (2004)
Laughter Detection in Meetings
NIST Meeting Recognition Workshop @ ICASSP, pp. 118-121, Montreal, May 2004. (4pp)

M.J. Reyes-Gomez, D. Ellis, N. Jojic (2004)
Multiband Audio Modeling for Single Channel Acoustic Source Separation
Proc. ICASSP-04, pp. V-641-644, Montreal, May 2004. (4pp)

M.J. Reyes-Gomez, N. Jojic, D. Ellis (2004)
Detailed graphical models for source separation and missing data interpolation in audio
Snowbird Learning Workshop, Snowbird, 2004. (2pp)

D. Ellis (2004)
Evaluating Speech Separation Systems
Chapter 20 in Speech Separation by Humans and Machines, ed. P. Divenyi, Kluwer, pp. 295-304. (12 pp)

M. Cooke and D. Ellis (2004)
Introduction to the special issue on the recognition and organization of real-world sound
Speech Communication, 43(4), Sep. 2004, pp. 273-274. (2pp)
DOI: 10.1016/j.specom.2004.05.001

 

D. Ellis and J. Arroyo (2004)
Eigenrhythms: Drum pattern basis sets for classification and generation
International Symposium on Music Information Retrieval ISMIR-04, Barcelona, Oct 2004, pp. 554-559. (6pp)
(See also the longer tech report version with color figures.)

B. Whitman and D. Ellis (2004)
Automatic Record Reviews
International Symposium on Music Information Retrieval ISMIR-04, Barcelona, Oct 2004, pp. 470-477. (8pp)

A. Berenzweig, B. Logan, D. Ellis, B. Whitman (2004)
A large-scale evaluation of acoustic and subjective music-similarity measures
Computer Music Journal, 28(2), pp. 63-76, June 2004. (14pp)

 

D. Ellis and K.S. Lee (2004)
Minimal-Impact Audio-Based Personal Archives
First ACM workshop on Continuous Archiving and Recording of Personal Experiences CARPE-04, New York, Oct 2004, pp. 39-47. (9pp)

D. Ellis and K.S. Lee (2004)
Features for Segmenting and Classifying Long-Duration Recordings of Personal Audio
ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing SAPA-04, Jeju, Korea, Oct 2004, pp. 1-6. (6pp)

2003

L. Kennedy and D. Ellis (2003)
Pitch-based emphasis detection for characterization of meeting recordings
Automatic Speech Recognition and Understanding Workshop IEEE ASRU 2003, pp. 243-248, St. Thomas, December 2003. (6pp)

M. Athineos and D. Ellis (2003)
Frequency-domain linear prediction for temporal features
Automatic Speech Recognition and Understanding Workshop IEEE ASRU 2003, pp. 261-266, St. Thomas, December 2003. (6pp)

M.J. Reyes-Gomez, B. Raj, D. Ellis (2003)
Multi-channel Source Separation by Beamforming Trained with Factorial HMMs
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, pp. 13-16, Mohonk NY, October 2003. (4pp)

P. Scanlon, D. Ellis, R. Reilly (2003)
Using Mutual Information to design class-specific phone recognizers
Proc. Eurospeech-03, Geneva, September 2003, pp. 857-860. (4pp)

S. Renals and D. Ellis (2003)
Audio Information Access from Meeting Rooms
Proc. ICASSP-03, Hong Kong, April 2003, pp. IV-744-747. (4pp)

M.J. Reyes-Gomez, B. Raj, D. Ellis (2003)
Multi-channel Source Separation by Factorial HMMs
Proc. ICASSP-03, Hong Kong, April 2003, pp. I-664-667. (4pp)

A. Janin, D. Baron, J. Edwards, D. Ellis, D. Gelbart, N. Morgan, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, C. Wooters (2003)
The ICSI Meeting Corpus
Proc. ICASSP-03, Hong Kong, April 2003, pp. I-364-367. (4pp)

 

A. Sheh and D. Ellis (2003)
Chord Segmentation and Recognition using EM-Trained Hidden Markov Models
4th International Symposium on Music Information Retrieval ISMIR-03, pp. 185-191, Baltimore, October 2003. (7pp)

R. Turetsky and D. Ellis (2003)
Ground-Truth Transcriptions of Real Music from Force-Aligned MIDI Syntheses
4th International Symposium on Music Information Retrieval ISMIR-03, pp. 135-141, Baltimore, October 2003. (7pp)

A. Berenzweig, B. Logan, D. Ellis, B. Whitman (2003)
A large-scale evaluation of acoustic and subjective music similarity measures
4th International Symposium on Music Information Retrieval ISMIR-03, pp. 103-109, Baltimore, October 2003. (7pp)

B. Logan, D. Ellis, A. Berenzweig (2003)
Toward evaluation techniques for music similarity
Keynote address, Workshop on the Evaluation of Music Information Retrieval (MIR) Systems at SIGIR 2003, Toronto, August 2003. (5pp)

A. Berenzweig, D. Ellis & S. Lawrence (2003)
Anchor Space for Classification and Similarity Measurement of Music
Proc. ICME-03, Baltimore, July 2003, pp. I-29-32. (4pp)

 

M.J. Reyes-Gomez and D. Ellis (2003)
Selection, Parameter Estimation, and Discriminative Training of Hidden Markov Models for General Audio Modeling
Proc. ICME-03, Baltimore, July 2003, pp. I-73-76. (4pp)

M. Athineos and D. Ellis (2003)
Sound Texture Modelling with Linear Prediction in both Time and Frequency Domains
Proc. ICASSP-03, Hong Kong, April 2003, pp. V-648-651. (4pp)

2002

A.J. Robinson, G.D. Cook, D. Ellis, E. Fosler-Lussier, S.J. Renals, D.A.G. Williams (2002)
Connectionist speech recognition of Broadcast News
Speech Communication, vol. 37 no. 1-2, May 2002, pp. 27-45. (19pp)

M.J. Reyes-Gomez and D. Ellis (2002)
Error visualization for tandem acoustic modeling on the Aurora task
ICASSP-02 (student session), Orlando, May 2002. (4pp)

 

D. Ellis, B. Whitman, A. Berenzweig, S. Lawrence (2002)
The Quest for Ground Truth in Musical Artist Similarity
Proc. ISMIR-02, pp. 170-177, Paris, October 2002. (8pp)

A. Berenzweig, D. Ellis, S. Lawrence (2002)
Using Voice Segments to Improve Artist Classification of Music
Proc. AES-22 Intl. Conf. on Virt., Synth., and Ent. Audio. Espoo, Finland, June 2002. (8pp)

 

2001

T. Pfau, D. Ellis, A. Stolcke (2001)
Multispeaker Speech Activity Detection for the ICSI Meeting Recorder
Proc. ASRU-01, Italy, December 2001. (4pp)

J. Barker, M. Cooke, D. Ellis (2001)
Integrating bottom-up and top-down constraints to achieve robust ASR: The multisource decoder
Presented at the CRAC workshop, pp. 63-66, Aalborg, Denmark, September 2001. (4pp)

D. Ellis and M.J. Reyes Gomez (2001)
Investigations into Tandem Acoustic Modeling for the Aurora Task
Proc. Eurospeech-01, Special Event on Noise Robust Recognition, pp. 189-192, Denmark, September 2001. (4pp)
(See also the poster I presented at the conference.)

M. Cooke and D. Ellis (2001)
The auditory organization of speech and other sources in listeners and computational models
Speech Communication, vol. 35 no. 3-4, Oct. 2001, pp. 141-177. (37pp)

D. Ellis, R. Singh, S. Sivadas (2001)
Tandem acoustic modeling in large-vocabulary recognition
Proc. ICASSP-2001, pp. I-517-520, Salt Lake City, May 2001. (4pp)
(See also the poster I presented at the conference.)

N. Morgan, D. Baron, J. Edwards, D. Ellis, D. Gelbart, A. Janin, T. Pfau, E. Shriberg, A. Stolcke (2001)
The Meeting Project at ICSI
Human Language Technologies Conference, San Diego, March 2001, pp. 246-252. (7pp)

 

A.L. Berenzweig and D. Ellis (2001)
Locating Singing Voice Segments within Music Signals
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, pp. 119-122, Mohonk NY, October 2001. (4pp)

 

D. Ellis (2001)
Detecting Alarm Sounds
Presented at the CRAC workshop, pp. 59-62, Aalborg, Denmark, September 2001. (4pp)
(See also the poster I presented at the workshop.)

2000

D. Ellis and J.A. Bilmes (2000)
Using mutual information to design feature combinations
Proc. ICSLP-2000, Beijing, October 2000. (4pp)

J. Barker, M. Cooke and D. Ellis (2000)
Decoding speech in the presence of other sound sources
Proc. ICSLP-2000, Beijing, October 2000. (4pp)

J. Ferreiros-Lopez and D. Ellis (2000)
Using acoustic condition clustering to improve acoustic change detection on Broadcast News
Proc. ICSLP-2000, Beijing, October 2000. (4pp)

D. Ellis (2000)
Improved recognition by combining different features and different systems
Proc. AVIOS-2000, San Jose, May 2000. (7pp)

D. Ellis (2000)
Stream combination before and/or after the acoustic model
Rejected from ICASSP-2000, now an ICSI tech. report. (4pp)

H. Hermansky, D. Ellis and S. Sharma (2000)
Tandem connectionist feature stream extraction for conventional HMM systems
Proc. ICASSP-2000, Istanbul, III-1635-1638. (4pp)
(See also the poster I presented at the conference.)

S. Sharma, D. Ellis, S. Kajarekar, P. Jain and H. Hermansky (2000)
Feature extraction using non-linear transformation for robust speech recognition on the Aurora database
Proc. ICASSP-2000, Istanbul, II-1117-1120. (4pp)

1999

D. Genoud, D. Ellis and N. Morgan (1999)
Combined speech and speaker recognition with speaker-adapted connectionist models
Proc. Auto. Speech Recog. & Understanding Workshop, Keystone. (4pp)

D. Abberley, S. Renals, T. Robinson and D. Ellis (1999)
The THISL SDR system at TREC-8
Proc. Text Retrieval Conference 8, Washington. (6pp)

G. Williams and D. Ellis (1999)
Speech/music discrimination based on posterior probability features
Proc. Eurospeech-99, Budapest. (4 pp)

A. Janin, D. Ellis and N. Morgan (1999)
Multi-stream speech recognition: Ready for prime time?
Proc. Eurospeech-99, Budapest. (4 pp)

D. Ellis and N. Morgan (1999)
Size matters: An empirical study of neural network training for large vocabulary continuous speech recognition
Proc. ICASSP-99, Phoenix. (4 pp)

N. Morgan, D. Ellis, E. Fosler-Lussier, A. Janin and B. Kingsbury (1999)
Reducing errors by increasing the error rate: MLP Acoustic Modeling for Broadcast News Transcription
Presented at the DARPA Broadcast News Transcription and Understanding Workshop, Gaithersburg VA, February 28, 1999. (4pp)

G. Cook, J. Christie, D. Ellis, E. Fosler-Lussier, Y. Gotoh, B. Kingsbury, N. Morgan, S. Renals, T. Robinson and G. Williams (1999)
The SPRACH System for the Transcription of Broadcast News
Presented at the DARPA Broadcast News Transcription and Understanding Workshop, Gaithersburg VA, February 28, 1999. (4pp)

1998 and before

D. Ellis (1998)
Using knowledge to organize sound: The prediction-driven approach to computational auditory scene analysis, and its application to speech/nonspeech mixtures
Speech Communication special issue on Computational Auditory Scene Analysis, M. Cooke & H. Okuno, eds., vol. 27 no. 3-4, April 1999, pp. 281-298. (11pp)

D. Ellis and D.F. Rosenthal (1998)
Mid-level representations for Computational Auditory Scene Analysis
Chapter 17 in Computational auditory scene analysis, D. F. Rosenthal and H. Okuno, eds., Lawrence Erlbaum, pp. 257-272, 1998. (7pp)
(also appeared in Proc. Intl. Joint Conf. on Artif. Intell. Workshop on Computational Auditory Scene Analysis, Montreal, August 1995.)

D. Ellis (1997)
The Weft: A representation for periodic sounds
Proc. Int. Conf. on Acous., Speech & Sig. Proc. ICASSP-97, Munich, vol. 2 pp. 1307-1310, April 1997. (4pp)
(See also the poster I presented at the conference.)

D. Ellis (1997)
Computational Auditory Scene Analysis exploiting Speech-Recognition knowledge
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, Mohonk, October 1997. (4pp)

D. Ellis (1996)
Prediction-driven computational auditory scene analysis for dense sound mixtures
Proc. ESCA Workshop on the Auditory Basis of Speech Perception, Keele, July 1996. (6pp)

D. Ellis (1996)
Prediction-driven computational auditory scene analysis
Ph.D. thesis, Dept. of Elec. Eng & Comp. Sci., M.I.T., June 1996. (180pp)

D. Ellis (1995)
Underconstrained stochastic representations for top-down computational auditory scene analysis
Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, Mohonk, October 1995. (4pp)

D. Ellis (1995)
Hard problems in computational auditory scene analysis
Posted to the AUDITORY email list, August 1995. (4pp)

D. Ellis (1994)
A computer implementation of psychoacoustic grouping rules
Proc. 12th Intl. Conf. on Pattern Recognition, Jerusalem, October 1994. (9pp)

D. Ellis (1993)
Vowel separation by glottal-pulse synchrony
Presented to the 126th meeting of the Acoustical Society of America, Denver, November 1993. (17pp)

D. Ellis (1993)
Hierarchic models of sound for separation and restoration
Proc. 1993 IEEE Mohonk workshop on Applications of Signal Processing to Acoustics and Audio, October 1993. (4pp)

D. Ellis and B.L. Vercoe (1992)
A perceptual representation of sound for auditory signal separation
Presented to the 123rd meeting of the Acoustical Society of America, Salt Lake City, May 1992. (8pp)

D. Ellis (1992)
A Perceptual Representation of Audio
Master's thesis, EECS dept, MIT, February 1992. (88pp)


Last updated: 2006/11/06 22:12:53
Dan Ellis <dpwe@ee.columbia.edu>