Dan Ellis : Publications (by type)

See also publications organized by topic and year.

Journal papers - Book chapters - Theses - International refereed conferences

Journal papers

  1. J. Salamon, E. Gomez, D. Ellis, G. Richard (2014)
    Melody Extraction from Polyphonic Music Signals
    IEEE Signal Processing Magazine, pp. 118-134, March 2014.
    DOI: 10.1109/MSP.2013.2271648

  2. J. Devaney, M. Mandel, D. Ellis, I. Fujinaga (2011)
    Automatically extracting performance data from recordings of trained singers
    Psychomusicology: Music, Mind & Brain 21(1-2), pp. 108-136, 2011.

  3. G. Grindlay and D. Ellis (2011)
    Transcribing Multi-instrument Polyphonic Music with Hierarchical Eigeninstruments
    IEEE J. Sel. Topics in Sig. Process., vol. 5 no. 6, pp. 1159-1169, October 2011.
    DOI: 10.1109/JSTSP.2011.2162395.

  4. M. Mueller, D. Ellis, A. Klapuri, and G. Richard (2011)
    Signal Processing for Music Analysis
    IEEE J. Sel. Topics in Sig. Process., vol. 5 no. 6, pp. 1088-1110, October 2011.
    DOI: 10.1109/JSTSP.2011.2112333.

  5. M. Mueller, D. Ellis, A. Klapuri, G. Richard, and S. Sagayama (2011)
    Introduction to the Special Issue on Music Signal Processing
    IEEE J. Sel. Topics in Sig. Process., vol. 5 no. 6, pp. 1085-1087, October 2011.
    DOI: 10.1109/JSTSP.2011.2165109.

  6. R. Weiss, M. Mandel, and D. Ellis (2011)
    Combining localization cues and source model constraints for binaural source separation
    Speech Communication, vol. 53 no. 5, pp. 606-621, May 2011.
    DOI: 10.1016/j.specom.2011.01.003

  7. M. Mandel, S. Bressler, B. Shinn-Cunningham, and D. Ellis (2010)
    Evaluating Source Separation Algorithms With Reverberant Speech
    IEEE Tr. Audio, Speech, and Lang. Proc., vol. 18 no. 7, pp. 1872-1883, September 2010.
    DOI: 10.1109/TASL.2010.2052252

  8. K. Lee and D. Ellis (2010)
    Audio-Based Semantic Concept Classification for Consumer Video
    IEEE Tr. Audio, Speech and Lang. Proc. vol. 18 no. 6 pp. 1406-1416, Aug. 2010.
    DOI: 10.1109/TASL.2009.2034776

  9. M. Mandel, R. Weiss, and D. Ellis (2010)
    Model-Based Expectation-Maximization Source Separation and Localization
    IEEE Tr. Audio, Speech, and Lang. Proc., vol. 18 no. 2, pp. 382-394, February 2010.
    DOI: 10.1109/TASL.2009.2029711

  10. R. Weiss and D. Ellis (2010)
    Speech separation using speaker-adapted eigenvoice speech models
    Computer Speech and Language, vol. 24 no. 1 pp. 16-29, Jan 2010.
    DOI: 10.1016/j.csl.2008.03.003

  11. J. H. Jensen, M. G. Christensen, D. P. W. Ellis, and S. H. Jensen (2009)
    Quantitative analysis of a common audio similarity measure
    IEEE Tr. Audio, Speech, Lang. Proc., vol. 17 no. 4 pp. 693-703, May 2009.

  12. M. Mandel and D. Ellis (2008)
    A Web-based Game for Collecting Music Metadata
    J. New Music Research, vol. 37 no. 2, pp. 151-165, 2008.

  13. J. Devaney and D. Ellis (2008)
    An Empirical Approach to Studying Intonation Tendencies in Polyphonic Vocal Performances
    J. Interdisc. Music Studies, vol. 2 no. 1-2, Spring/Fall 2008, pp. 141-156. (16pp)

  14. M. Slaney, D. Ellis, M. Sandler, M. Goto, M. Goodwin (2008)
    Introduction to the Special Issue on Music Information Retrieval
    IEEE Tr. Audio, Speech, Lang. Proc. vol. 16 no. 2, Feb 2008, pp. 253-254. (2pp)

  15. M. Athineos and D. Ellis (2007)
    Autoregressive Modeling of Temporal Envelopes
    IEEE Tr. Signal Processing, vol. 55 no. 11, Nov 2007, pp. 5237-5245. (9pp)

  16. G. Poliner, D. Ellis, A. Ehmann, E. Gómez, S. Streich, B. Ong (2007)
    Melody Transcription from Music Audio: Approaches and Evaluation
    IEEE Tr. Audio, Speech, Lang. Proc., vol. 15 no. 4, May 2007, pp. 1247-1256. (10pp)

  17. D. Ellis (2007)
    Beat Tracking by Dynamic Programming
    J. New Music Research, Special Issue on Beat and Tempo Extraction, vol. 36 no. 1, March 2007, pp. 51-60. (10pp)
    DOI: 10.1080/09298210701653344

  18. P. Scanlon, D. Ellis, R. Reilly (2007)
    Using Broad Phonetic Group Experts for Improved Speech Recognition
    IEEE Tr. Audio, Speech, Lang. Proc., vol. 15 no. 3, March 2007, pp. 803-812. (10pp)

  19. D. Ellis and K. Lee (2006)
    Accessing minimal-impact personal audio archives
    IEEE MultiMedia, vol. 13 no. 4, Oct-Dec 2006, pp. 30-38. (9pp)

  20. G. Poliner and D. Ellis (2006)
    A Discriminative Model for Polyphonic Piano Transcription
    Eurasip Journal of Advances in Signal Processing, special issue on Music Signal Processing, 2007 (2007), Article ID 48317. (9pp)
    DOI: 10.1155/2007/48317

  21. D. Ellis (2006)
    Extracting Information from Music Audio
    Communications of the ACM, invited paper, special issue on Music Information Retrieval, vol. 49 no. 8, August 2006, pp. 32-37. (6pp)

  22. D. Ellis and G. Poliner (2006)
    Classification-Based Melody Transcription
    Machine Learning, special issue on Machine Learning In and For Music, vol. 65 no. 2-3, Dec 2006, pp. 439-456. (18pp)
    DOI: 10.1007/s10994-006-8373-9

  23. M. Mandel, G. Poliner, D. Ellis (2006)
    Support Vector Machine Active Learning for Music Retrieval
    Multimedia Systems, special issue on Machine Learning Approaches to Multimedia Information Retrieval, vol. 12 no. 1, Aug 2006, pp. 3-13. (10pp)
    DOI: 10.1007/s00530-006-0032-2

  24. D. Ellis, B. Raj, J. Brown, M. Slaney, P. Smaragdis (2006)
    Editorial - Special Section on Statistical and Perceptual Audio Processing
    IEEE Tr. Audio, Speech and Lang. Proc., vol. 14 no. 1, Jan. 2006, pp. 2-4. (3pp)

  25. X. Halkias and D. Ellis (2006)
    Call detection and extraction using Bayesian inference
    Applied Acoustics, special issue on Marine Mammal Detection, vol. 67 no. 11-12, Nov-Dec. 2006, pp. 1164-1174. (11pp)

  26. N. Morgan, Q. Zhu, A. Stolcke, K. Sonmez, S. Sivadas, T. Shinozaki, M. Ostendorf, P. Jain, H. Hermansky, D. Ellis, G. Doddington, B. Chen, O. Cetin, H. Bourlard, and M. Athineos (2005)
    Pushing the Envelope -- Aside
    IEEE Signal Processing Magazine 22(5), Sep. 2005, pp. 81-88. (8pp)

  27. J. Barker, M. Cooke, D. Ellis (2005)
    Decoding speech in the presence of other sources
    Speech Communication, 45(1), Jan. 2005, pp. 5-25. (26pp)

  28. M. Cooke and D. Ellis (2004)
    Introduction to the special issue on the recognition and organization of real-world sound
    Speech Communication, 43(4), Sep. 2004, pp. 273-274. (2pp)
    DOI: 10.1016/j.specom.2004.05.001.

  29. A. Berenzweig, B. Logan, D. Ellis, B. Whitman (2004)
    A large-scale evaluation of acoustic and subjective music-similarity measures
    Computer Music Journal, 28(2), June 2004, pp. 63-76. (14pp)

  30. A.J. Robinson, G.D. Cook, D. Ellis, E. Fosler-Lussier, S.J. Renals, D.A.G. Williams (2002)
    Connectionist speech recognition of Broadcast News
    Speech Communication, vol. 37 no. 1-2, May 2002, pp. 27-45. (19pp)

  31. M. Cooke and D. Ellis (2001)
    The auditory organization of speech and other sources in listeners and computational models
    Speech Communication, vol. 35 no. 3-4, Oct. 2001, pp. 141-177. (37pp)

  32. D. Ellis (1998)
    Using knowledge to organize sound: The prediction-driven approach to computational auditory scene analysis, and its application to speech/nonspeech mixtures
    Speech Communication special issue on Computational Auditory Scene Analysis, M. Cooke & H. Okuno, eds., vol. 27 no. 3-4, April 1999, pp. 281-298. (11pp)

Book chapters

  1. D. Ellis (2008)
    An introduction to signal processing for speech
    to appear as a chapter in The Handbook of Phonetic Science, 2nd ed., ed. Hardcastle and Laver.

  2. D. Ellis (2006)
    Model-Based Scene Analysis
    Chapter 4 of Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, D. Wang & G. Brown, eds., Wiley/IEEE Press, pp. 115-146, 2006. (46pp)

  3. D. Ellis (2006)
    Modeling the auditory component of speech
    Chapter 24 of Listening to speech: An auditory perspective, S. Greenberg & W. Ainsworth, eds., Lawrence Erlbaum, pp. 393-307, 2006. (13pp)

  4. D. Ellis (2004)
    Evaluating Speech Separation Systems
    Chapter 20 in Speech Separation by Humans and Machines, ed. P. Divenyi, Kluwer, pp. 295-304. (12pp)

  5. D. Ellis and D. Rosenthal (1998)
    Mid-level representations for Computational Auditory Scene Analysis: The Weft Element
    Chapter 17 in Computational auditory scene analysis, D. F. Rosenthal and H. Okuno, eds., Lawrence Erlbaum, pp. 257-272, 1998. (7pp)
    (also appeared in Proc. Intl. Joint Conf. on Artif. Intell. Workshop on Computational Auditory Scene Analysis, Montreal, August 1995.)

Theses

  1. D. Ellis (1996)
    Prediction-driven computational auditory scene analysis
    Ph.D. thesis, Dept. of Elec. Eng & Comp. Sci., M.I.T., June 1996. (180pp)

  2. D. Ellis (1992)
    A Perceptual Representation of Audio
    Master's thesis, EECS dept, MIT, February 1992. (88pp)

International Conferences (refereed)

  1. Z. Chen, B. McFee, D. Ellis (2014)
    Speech enhancement by low-rank and convolutive dictionary spectrogram decomposition
    Proc. Interspeech, (to appear), Singapore, Sep 2014.

  2. D. Ellis, H. Satoh, and Z. Chen (2014)
    Detecting proximity from personal audio recordings
    Proc. Interspeech, (to appear), Singapore, Sep 2014.

  3. Colin Raffel, Brian McFee, Eric J. Humphrey, Justin Salamon, Oriol Nieto, Dawen Liang, and Daniel P. W. Ellis (2014)
    mir_eval: A Transparent Implementation of Common MIR Metrics
    Proc. ISMIR, (to appear), Taipei, Taiwan, Oct 2014.

  4. B. McFee, D. Ellis (2014)
    Analyzing Song Structure With Spectral Clustering
    Proc. ISMIR, (to appear), Taipei, Taiwan, Oct 2014.

  5. D. Liang, J. Paisley, D. Ellis (2014)
    Codebook-based Scalable Music Tagging With Poisson Matrix Factorization
    Proc. ISMIR, (to appear), Taipei, Taiwan, Oct 2014.

  6. H. Papadopoulos, D. Ellis (2014)
    Music-content-adaptive robust principal component analysis for a semantically consistent separation of foreground and background in music audio signals
    Proc. DAFx, (to appear), Erlangen, Sep 2014.

  7. Z. Chen, H. Papadopoulos, D. Ellis (2014)
    Content-adaptive speech enhancement by a sparsely-activated dictionary plus low rank decomposition
    Proc. HSCMA, Nancy, May 2014.

  8. D. Liang, D. Ellis, M. Hoffman, G. Mysore (2014)
    Speech Decoloration Based On The Product-Of-Filters Model
    Proc. ICASSP, 2400-2404, Florence, May 2014.
    DOI: 10.1109/ICASSP.2014.6854030

  9. B. McFee, D. Ellis (2014)
    Better Beat Tracking Through Robust Onset Aggregation
    Proc. ICASSP, 2154-2158, Florence, May 2014.
    DOI: 10.1109/ICASSP.2014.6853980

  10. B. McFee, D. Ellis (2014)
    Learning To Segment Songs With Ordinal Linear Discriminant Analysis
    Proc. ICASSP, 5197-5201, Florence, May 2014.
    DOI: 10.1109/ICASSP.2014.6854594

  11. M. McVicar, D. Ellis, M. Goto (2014)
    Leveraging Repeated Utterances for Improved Transcription of Chorus Lyrics from Sung Audio
    Proc. ICASSP, 3117-3121, Florence, May 2014.
    DOI: 10.1109/ICASSP.2014.6854174

  12. C. Raffel, D. Ellis (2014)
    Estimating Timing and Channel Distortion Across Related Signals
    Proc. ICASSP, 654-658, Florence, May 2014.
    DOI: 10.1109/ICASSP.2014.6853677

  13. D. Silva, V. de Souza, G. Batista, E. Keogh, D. Ellis (2013)
    Applying Machine Learning and Audio Analysis Techniques to Insect Recognition in Intelligent Traps
    Proc. ICMLA, (to appear), Miami, December 2013.

  14. D. Liang, M. Hoffman, D. Ellis (2013)
    Beta Process Sparse Nonnegative Matrix Factorization For Music
    Proc. ISMIR, (to appear), Curitiba, November 2013.

  15. D. Silva, H. Papadopoulos, G. Batista, D. Ellis (2013)
    A Video Compression-Based Approach To Measure Music Structure Similarity
    Proc. ISMIR, (to appear), Curitiba, November 2013.

  16. Z. Chen and D. Ellis (2013)
    Speech Enhancement By Sparse, Low-Rank, And Dictionary Spectrogram Decomposition
    Proc. IEEE WASPAA, (to appear), Mohonk, October 2013.

  17. D. Gillespie and D. Ellis (2013)
    Modeling nonlinear circuits with linearized dynamical models via kernel regression
    Proc. IEEE WASPAA, (to appear), Mohonk, October 2013.

  18. M. Graciarena, A. Alwan, D. Ellis, H. Franco, L. Ferrer, J. Hansen, A. Janin, B.-S. Lee, Y. Lei, V. Mitra, N. Morgan, S. O. Sadjadi, T.J. Tsai, N. Scheffer, L. N. Tan, B. Williams (2013)
    All for One: Feature Combination for Highly Channel-Degraded Speech Activity Detection
    Proc. Interspeech, Lyon, August 2013, paper 1338.

  19. C. Cotton and D. Ellis (2013)
    Subband Autocorrelation Features for Video Soundtrack Classification
    Proc. ICASSP-13, Vancouver, May 2013, 8663-8666.

  20. K. Su, M. Naaman, A. Gurjar, M. Patel, and D. Ellis (2012)
    Making a Scene: Alignment of Complete Sets of Clips based on Pairwise Audio Match
    Proc. ICMR-12, Hong Kong, June 2012, 26-33.

  21. B.-S. Lee and D. Ellis (2012)
    Noise Robust Pitch Tracking by Subband Autocorrelation Classification
    Proc. Interspeech-12, Portland, September 2012, paper P3b.05.

  22. J. McDermott, D. Ellis, H. Kawahara (2012)
    Inharmonic Speech: A Tool for the Study of Speech Perception and Separation
    Proc. SAPA-SCALE 2012, Portland, September 2012, 114-117.

  23. T. Bertin-Mahieux and D. Ellis (2012)
    Large-Scale Cover Song Recognition Using the 2D Fourier Transform Magnitude
    Proc. ISMIR-12, Porto, October 2012, 241-246.

  24. B. McFee, T. Bertin-Mahieux, D. Ellis, and G. Lanckriet (2012)
    The Million Song Dataset Challenge
    Proc. WWW-2012 AdMIRe Workshop, Lyon, April 2012, 909-916.

  25. T. Bertin-Mahieux, D. Ellis, B. Whitman, and P. Lamere (2011)
    The Million Song Dataset
    Proc. ISMIR pp. 591-596, Miami, October 2011.

  26. D. Ellis, B. Whitman, and A. Porter (2011)
    Echoprint - An Open Music Identification Service
    Proc. ISMIR, late-breaking session, Miami, October 2011.

  27. T. Bertin-Mahieux and D. Ellis (2011)
    Large-Scale Cover Song Recognition Using Hashed Chroma Landmarks
    Proc. IEEE WASPAA, pp. 117-120, Mohonk, October 2011.

  28. C. Cotton and D. Ellis (2011)
    Spectral vs. Spectro-Temporal Features for Acoustic Event Detection
    Proc. IEEE WASPAA, pp. 69-72, Mohonk, October 2011.

  29. T. Bertin-Mahieux, G. Grindlay, R. Weiss, and D. Ellis (2011)
    Evaluating music sequence models through missing data
    Proc. IEEE ICASSP, pp. 177-180, Prague, May 2011.

  30. C. Cotton, D. Ellis, and A. Loui (2011)
    Soundtrack classification by transient events
    Proc. IEEE ICASSP, pp. 473-476, Prague, May 2011.

  31. D. Ellis, X. Zheng, and J. McDermott (2011)
    Classifying soundtracks with audio texture features
    Proc. IEEE ICASSP, pp. 5880-5883, Prague, May 2011.

  32. C. Vezyrtzis, A. Klein, D. Ellis, Y. Tsividis (2011)
    Direct Processing of MPEG Audio Using Companding and BFP Techniques
    Proc. IEEE ICASSP, pp. 361-364, Prague, May 2011.

  33. Y.-G. Jiang, G. Ye, S.-F. Chang, D. Ellis, and A. C. Loui (2011)
    Consumer Video Understanding: A Benchmark Database and An Evaluation of Human and Machine Performance
    Proc. ACM ICMR, article #29, Trento, Apr 2011.

  34. G. Grindlay and D. Ellis (2010)
    A Probabilistic Subspace Model for Multi-Instrument Polyphonic Transcription
    Proc. ISMIR, pp. 21-26, Utrecht, August 2010.

  35. T. Bertin-Mahieux, R. Weiss, and D. Ellis (2010)
    Clustering beat-chroma patterns in a large music database
    Proc. ISMIR, pp. 111-116, Utrecht, August 2010.

  36. D. Ellis, B. Whitman, T. Jehan, and P. Lamere (2010)
    The Echo Nest Musical Fingerprint
    ISMIR Late Breaking Abstracts, Utrecht, August 2010.

  37. D. Ellis and A. Weller (2010)
    The 2010 LabROSA chord recognition system
    MIREX 2010 system abstracts, August 2010.

  38. S. Ravuri and D. Ellis (2010)
    Cover Song Detection: From High Scores to General Classification
    Proc. IEEE ICASSP, pp. 65-68, Dallas, March 2010.

  39. C. Cotton and D. Ellis (2010)
    Audio Fingerprinting to Identify Multiple Videos of an Event
    Proc. IEEE ICASSP, pp. 2386-2389, Dallas, March 2010.

  40. K. Lee, D. Ellis, and A. Loui (2010)
    Detecting Local Semantic Concepts in Environmental Sounds using Markov Model based Clustering
    Proc. IEEE ICASSP, pp. 2278-2281, Dallas, March 2010.

  41. A. Weller, D. Ellis, and T. Jebara (2009)
    Structured Prediction Models for Chord Transcription of Music Audio
    Proc. ICMLA, Miami Beach FL, December 2009, pp. 590-595.

  42. C. Cotton and D. Ellis (2009)
    Finding Similar Acoustic Events using Matching Pursuit and Locality-Sensitive Hashing
    Proc. WASPAA-09, Mohonk NY, October 2009, pp. 125-128.

  43. C. Smit and D. Ellis (2009)
    Guided Harmonic Sinusoid Estimation in a Multi-Pitch Environment
    Proc. WASPAA-09, Mohonk NY, October 2009, pp. 41-44.

  44. G. Grindlay and D. Ellis (2009)
    Multi-Voice Polyphonic Music Transcription Using Eigeninstruments
    Proc. WASPAA-09, Mohonk NY, October 2009, pp. 53-56.

  45. J. Devaney, M. Mandel, and D. Ellis (2009)
    Improving Midi-Audio Alignment with Acoustic Features
    Proc. WASPAA-09, Mohonk NY, October 2009, pp. 45-48.

  46. M. Mandel and D. Ellis (2009)
    The Ideal Interaural Parameter Mask: A Bound on Binaural Separation Systems
    Proc. WASPAA-09, Mohonk NY, October 2009, pp. 85-88.

  47. W. Jiang, C. Cotton, S.-F. Chang, D. Ellis, and A. Loui (2009)
    Short-Term Audio-Visual Atoms for Generic Video Concept Classification
    Proc. ACM MultiMedia-09, Beijing, October 2009, pp. 5-14.

  48. J. Gudnason, M. Thomas, P. Naylor, and D. Ellis (2009)
    Voice Source Waveform Analysis and Synthesis using Principal Component Analysis and Gaussian Mixture Modelling
    Proc. Interspeech-09, Brighton, September 2009, pp. 108-111.

  49. J. Devaney and D. Ellis (2009)
    Handling Asynchrony in Audio-Score Alignment
    Proc. ICMC-09, Montreal, pp. 29-32, August 2009.

  50. J. B. Boldt and D. Ellis (2009)
    A Simple Correlation-Based Model of Intelligibility for Nonlinear Speech Enhancement and Separation
    Proc. EUSIPCO'09, Glasgow, August 2009, pp. 1849-1853.

  51. R. Weiss and D. Ellis (2009)
    A Variational EM Algorithm for Learning Eigenvoice Parameters in Mixed Signals
    Proc. ICASSP-09, pp. 113-116, Taiwan, April 2009.

  52. M. Mandel and D. Ellis (2008)
    Multiple-Instance Learning For Music Information Retrieval
    Proc. ISMIR 2008, pp. 577-582, Philadelphia, September 2008.

  53. R. Weiss, M. Mandel, D. Ellis (2008)
    Source Separation Based on Binaural Cues and Source Model Constraints
    Proc. Interspeech-08, pp. 419-422, Brisbane, Australia, September 2008.

  54. K. Hu, P. Divenyi, D. Ellis, Z. Jin, B. Shinn-Cunningham, D. Wang (2008)
    Preliminary Intelligibility Tests of a Monaural Speech Segregation System
    Proc. SAPA-08, pp. 11-16, Brisbane, Australia, September 2008.

  55. A. Lammert, D. Ellis, P. Divenyi (2008)
    Data-driven articulatory inversion incorporating articulator priors
    Proc. SAPA-08, pp. 29-34, Brisbane, Australia, September 2008.

  56. S. Ravuri and D. Ellis (2008)
    Stylization of Pitch with Syllable-Based Linear Segments
    Proc. ICASSP-08 Las Vegas, April 2008, pp. 3985-3988.

  57. D. Ellis, C. Cotton, and M. Mandel (2008)
    Cross-Correlation of Beat-Synchronous Representations for Music Similarity
    Proc. ICASSP-08 Las Vegas, April 2008, pp. 57-60.
    See also the talk slides.

  58. J. H. Jensen, M. G. Christensen, D. Ellis, and S. H. Jensen (2008)
    A Tempo-Insensitive Distance Measure for Cover Song Identification based on Chroma Features
    Proc. ICASSP-08 Las Vegas, April 2008, pp. 2209-2212.

  59. K. Lee and D. Ellis (2008)
    Detecting Music in Ambient Audio by Long-Window Autocorrelation
    Proc. ICASSP-08 Las Vegas, April 2008, pp. 9-12.

  60. M. Mandel and D. Ellis (2007)
    EM localization and separation using interaural level and phase cues
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 275-278.

  61. R. Weiss and D. Ellis (2007)
    Monaural speech separation using source-adapted models
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 114-117.

  62. C. Smit and D. Ellis (2007)
    Solo voice detection via optimal cancelation
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 207-210.

  63. G. Poliner and D. Ellis (2007)
    Improving generalization for polyphonic piano transcription
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-07, Mohonk NY, October 2007, pp. 86-89.

  64. S.-F. Chang, D. Ellis, W. Jiang, K. Lee, A. Yanagawa, A. Loui, J. Luo (2007)
    Large-scale multimodal semantic concept detection for consumer video
    Multimedia Information Retrieval workshop, ACM Multimedia Augsburg, Germany, Sep 2007, pp. 255-264.
    DOI: 10.1145/1290082.1290118

  65. D. Ellis (2007)
    Classifying Music Audio with Timbral and Chroma Features
    Proc. Int. Conf. on Music Info. Retrieval ISMIR-07 Vienna, Austria, pp. 339-340.
    (See also the poster I presented at ISMIR-07.)

  66. M. Mandel and D. Ellis (2007)
    A Web-Based Game for Collecting Music Metadata
    Proc. Int. Conf. on Music Info. Retrieval ISMIR-07 Vienna, Austria, pp. 365-366.
    (See also the 6 page tech. report.)

  67. J. H. Jensen, D. Ellis, M. G. Christensen, S. H. Jensen (2007)
    Evaluation of Distance Measures Between Gaussian Mixture Models of MFCCs
    Proc. Int. Conf. on Music Info. Retrieval ISMIR-07 Vienna, Austria, pp. 107-108.

  68. D. Ellis and C. Cotton (2007)
    The 2007 LabROSA Cover Song Detection System
    MIREX 2007 Audio Cover Song Evaluation system description, Sep 2007. (4pp)
    (See also the poster I presented at ISMIR-07.)

  69. A. Doherty, A. Smeaton, K.-S. Lee, and D. Ellis (2007)
    Multimodal Segmentation of Lifelog Data
    Proc. 8th Int. Conf. on Computer-Assisted Information Retrieval RIAO 2007, Pittsburgh, May 2007. (18pp)

  70. J. Ogle and D. Ellis (2007)
    Fingerprinting to Identify Repeated Sound Events in Long-Duration Personal Audio Recordings
    Proc. ICASSP-07 Hawai'i, pp. I-233-236. (4pp)

  71. D. Ellis and G. Poliner (2007)
    Identifying Cover Songs With Chroma Features and Dynamic Programming Beat Tracking
    Proc. ICASSP-07 Hawai'i, pp. IV-1429-1432. (4pp)

  72. M. Mandel, D. Ellis, and T. Jebara (2006)
    An EM algorithm for localizing multiple sound sources in reverberant environments
    Advances Neural Info. Proc. Sys. 19, Vancouver CA, Dec 2006, pp. 953-960. (8pp)

  73. K. Lee and D. Ellis (2006)
    Voice Activity Detection in Personal Audio Recordings Using Autocorrelogram Compensation
    Interspeech ICSLP-06, pp. 1970-1973, Pittsburgh, Oct 2006. (4pp)

  74. M. Mandel and D. Ellis (2006)
    A probability model for interaural phase difference
    Proc. Workshop on Statistical and Perceptual Audition SAPA-06, pp. 1-6, Pittsburgh PA, Oct 2006. (6pp)

  75. R. Weiss and D. Ellis (2006)
    Estimating single-channel source separation masks: Relevance Vector Machine classifiers vs. pitch-based masking
    Proc. Workshop on Statistical and Perceptual Audition SAPA-06, pp. 31-36, Pittsburgh PA, Oct 2006. (6pp)

  76. D. Ellis (2006)
    Identifying `Cover Songs' with Beat-Synchronous Chroma Features
    MIREX 2006 Audio Cover Song Contest system description, Sep 2006. (4pp)

  77. D. Ellis (2006)
    Beat Tracking with Dynamic Programming
    MIREX 2006 Audio Beat Tracking Contest system description, Sep 2006. (3pp)

  78. D. Ellis and R. Weiss (2006)
    Model-Based Monaural Source Separation Using a Vector-Quantized Phase-Vocoder Representation
    Proc. ICASSP-06, Toulouse, May 2006, pp. V-957-960. (4pp)

  79. X. Halkias and D. Ellis (2006)
    Estimating the Number of Marine Mammals using Recordings of Clicks from One Microphone
    Proc. ICASSP-06, Toulouse, May 2006, pp. V-769-772. (4pp)

  80. M. Reyes-Gomez, N. Jojic, and D. Ellis (2005)
    Deformable Spectrograms
    AI & Statistics 2005, Barbados, Jan 2005, pp. 285-292. (8pp)

  81. G. Poliner, D. Ellis (2005)
    A Classification Approach to Melody Transcription
    Proc. Int. Conf. on Music Info. Retrieval ISMIR-05, London, September 2005, pp. 161-166. (6pp)

  82. M. Mandel, D. Ellis (2005)
    Song-Level Features and Support Vector Machines for Music Classification
    Proc. Int. Conf. on Music Info. Retrieval ISMIR-05, London, September 2005, pp. 594-599. (6pp)

  83. K. Dobson, B. Whitman, D. Ellis (2005)
    Learning Auditory Models of Machine Voices
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio WASPAA-05, Mohonk NY, October 2005, pp. 339-342. (4pp)

  84. C.-P. Chen, J. Bilmes, D. Ellis (2005)
    Speech Feature Smoothing for Robust ASR
    Proc. ICASSP-05, Philadelphia, March 2005, pp. I-525-528. (4pp)

  85. N. Lesser, D. Ellis (2005)
    Clap Detection and Discrimination for Rhythm Therapy
    Proc. ICASSP-05, Philadelphia, March 2005, pp. III-37-40. (4pp)
    (See also the talk slides which describe an energy ratio feature that does much better than the ones described in the paper.)

  86. M. Athineos, H. Hermansky and D. Ellis (2004)
    LP-TRAP: Linear predictive temporal patterns
    International Conference on Spoken Language Processing ICSLP-04, Jeju, Korea, Oct 2004, pp. 949-952. (4pp)

  87. M. Athineos, H. Hermansky and D. Ellis (2004)
    PLP^2: Autoregressive modeling of auditory-like 2-D spectro-temporal patterns
    ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing SAPA-04, Jeju, Korea, Oct 2004, pp. 37-42. (5pp)

  88. M. Reyes-Gomez, N. Jojic, and D. Ellis (2004)
    Towards single-channel unsupervised source separation of speech mixtures: The layered harmonics/formants separation-tracking model
    ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing SAPA-04, Jeju, Korea, Oct 2004, pp. 25-30. (6pp)

  89. D. Ellis and J. Liu (2004)
    Speaker turn segmentation based on between-channel differences
    NIST Meeting Recognition Workshop @ ICASSP, pp. 112-117, Montreal, May 2004. (6pp)

  90. L. Kennedy and D. Ellis (2004)
    Laughter Detection in Meetings
    NIST Meeting Recognition Workshop @ ICASSP, pp. 118-121, Montreal, May 2004. (4pp)

  91. M.J. Reyes-Gomez, D. Ellis, N. Jojic (2004)
    Multiband Audio Modeling for Single Channel Acoustic Source Separation
    Proc. ICASSP-04, pp. V-641-644, Montreal, May 2004. (4pp)

  92. M.J. Reyes-Gomez, N. Jojic, D. Ellis (2004)
    Detailed graphical models for source separation and missing data interpolation in audio
    Snowbird Learning Workshop, Snowbird, 2004. (2pp)

  93. D. Ellis and J. Arroyo (2004)
    Eigenrhythms: Drum pattern basis sets for classification and generation
    International Symposium on Music Information Retrieval ISMIR-04, Barcelona, Oct 2004, pp. 554-559. (6pp)
    (longer tech report version with color figures)

  94. B. Whitman and D. Ellis (2004)
    Automatic Record Reviews
    International Symposium on Music Information Retrieval ISMIR-04, Barcelona, Oct 2004, pp. 470-477. (8pp)

  95. D. Ellis and K.S. Lee (2004)
    Minimal-Impact Audio-Based Personal Archives
    First ACM workshop on Continuous Archiving and Recording of Personal Experiences CARPE-04, New York, Oct 2004, pp. 39-47. (9pp)

  96. D. Ellis and K.S. Lee (2004)
    Features for Segmenting and Classifying Long-Duration Recordings of Personal Audio
    ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing SAPA-04, Jeju, Korea, Oct 2004, pp. 1-6. (6pp)

  97. L. Kennedy and D. Ellis (2003)
    Pitch-based emphasis detection for characterization of meeting recordings
    Automatic Speech Recognition and Understanding Workshop IEEE ASRU 2003, pp. 243-248, St. Thomas, December 2003. (6pp)

  98. M. Athineos and D. Ellis (2003)
    Frequency-domain linear prediction for temporal features
    Automatic Speech Recognition and Understanding Workshop IEEE ASRU 2003, pp. 261-266, St. Thomas, December 2003. (6pp)

  99. M.J. Reyes-Gomez, B. Raj, D. Ellis (2003)
    Multi-channel Source Separation by Beamforming Trained with Factorial HMMs
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, pp. 13-16, Mohonk NY, October 2003. (4pp)

  100. P. Scanlon, D. Ellis, R. Reilly (2003)
    Using Mutual Information to design class-specific phone recognizers
    Proc. Eurospeech-03, Geneva, September 2003, pp. 857-860. (4pp)

  101. S. Renals and D. Ellis (2003)
    Audio Information Access from Meeting Rooms
    Proc. ICASSP-03, Hong Kong, April 2003, pp. IV-744-747. (4pp)

  102. M.J. Reyes-Gomez, B. Raj, D. Ellis (2003)
    Multi-channel Source Separation by Factorial HMMs
    Proc. ICASSP-03, Hong Kong, April 2003, pp. I-664-667. (4pp)

  103. A. Janin, D. Baron, J. Edwards, D. Ellis, D. Gelbart, N. Morgan, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, C. Wooters (2003)
    The ICSI Meeting Corpus
    Proc. ICASSP-03, Hong Kong, April 2003, pp. I-364-367. (4pp)

  104. M. Athineos and D. Ellis (2003)
    Sound Texture Modelling with Linear Prediction in both Time and Frequency Domains
    Proc. ICASSP-03, Hong Kong, April 2003, pp. V-648-651. (4pp)

  105. A. Sheh and D. Ellis (2003)
    Chord Segmentation and Recognition using EM-Trained Hidden Markov Models
    4th International Symposium on Music Information Retrieval ISMIR-03, pp. 185-191, Baltimore, October 2003. (7pp)

  106. R. Turetsky and D. Ellis (2003)
    Ground-Truth Transcriptions of Real Music from Force-Aligned MIDI Syntheses
    4th International Symposium on Music Information Retrieval ISMIR-03, pp. 135-141, Baltimore, October 2003. (7pp)

  107. A. Berenzweig, B. Logan, D. Ellis, B. Whitman (2003)
    A large-scale evaluation of acoustic and subjective music similarity measures
    4th International Symposium on Music Information Retrieval ISMIR-03, pp. 103-109, Baltimore, October 2003. (7pp)

  108. B. Logan, D. Ellis, A. Berenzweig (2003)
    Toward evaluation techniques for music similarity
    Keynote address, Workshop on the Evaluation of Music Information Retrieval (MIR) Systems at SIGIR 2003, Toronto, August 2003. (5pp)

  109. A. Berenzweig, D. Ellis & S. Lawrence (2003)
    Anchor Space for Classification and Similarity Measurement of Music
    Proc. ICME-03, Baltimore, July 2003, pp. I-29-32. (4pp)

  110. M.J. Reyes-Gomez and D. Ellis (2003)
    Selection, Parameter Estimation, and Discriminative Training of Hidden Markov Models for General Audio Modeling
    Proc. ICME-03, Baltimore, July 2003, pp. I-73-76. (4pp)

  111. M.J. Reyes-Gomez and D. Ellis (2002)
    Error visualization for tandem acoustic modeling on the Aurora task
    ICASSP-02 (student session), Orlando, May 2002. (4pp)

  112. D. Ellis, B. Whitman, A. Berenzweig, S. Lawrence (2002)
    The Quest for Ground Truth in Musical Artist Similarity
    Proc. ISMIR-02, pp. 170-177, Paris, October 2002. (8pp)

  113. A. Berenzweig, D. Ellis, S. Lawrence (2002)
    Using Voice Segments to Improve Artist Classification of Music
    Proc. AES-22 Intl. Conf. on Virt., Synth., and Ent. Audio. Espoo, Finland, June 2002. (8pp)

  114. T. Pfau, D. Ellis, A. Stolcke (2001)
    Multispeaker Speech Activity Detection for the ICSI Meeting Recorder
    Proc. ASRU-01, Italy, December 2001. (4pp)

  115. J. Barker, M. Cooke, D. Ellis (2001)
    Integrating bottom-up and top-down constraints to achieve robust ASR: The multisource decoder
    Presented at the CRAC workshop, pp. 63-66, Aalborg, Denmark, September 2001. (4pp)

  116. D. Ellis and M.J. Reyes-Gomez (2001)
    Investigations into Tandem Acoustic Modeling for the Aurora Task
    Proc. Eurospeech-01, Special Event on Noise Robust Recognition, pp. 189-192, Denmark, September 2001. (4pp)
    (See also the poster I presented at the conference.)

  117. D. Ellis, R. Singh, S. Sivadas (2001)
    Tandem acoustic modeling in large-vocabulary recognition
    Proc. ICASSP-2001, pp. I-517-520, Salt Lake City, May 2001. (4pp)
    (See also the poster I presented at the conference.)

  118. N. Morgan, D. Baron, J. Edwards, D. Ellis, D. Gelbart, A. Janin, T. Pfau, E. Shriberg, A. Stolcke (2001)
    The Meeting Project at ICSI
    Human Language Technologies Conference, San Diego, March 2001, pp. 246-252. (7pp)

  119. A.L. Berenzweig and D. Ellis (2001)
    Locating Singing Voice Segments within Music Signals
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, pp. 119-122, Mohonk NY, October 2001. (4pp)

  120. D. Ellis (2001)
    Detecting Alarm Sounds
    Presented at the CRAC workshop, pp. 59-62, Aalborg, Denmark, September 2001. (4pp)
    (See also the poster I presented at the workshop.)

  121. D. Ellis and J.A. Bilmes (2000)
    Using mutual information to design feature combinations
    Proc. ICSLP-2000, Beijing, October 2000. (4pp)

  122. J. Barker, M. Cooke and D. Ellis (2000)
    Decoding speech in the presence of other sound sources
    Proc. ICSLP-2000, Beijing, October 2000. (4pp)

  123. J. Ferreiros-Lopez and D. Ellis (2000)
    Using acoustic condition clustering to improve acoustic change detection on Broadcast News
    Proc. ICSLP-2000, Beijing, October 2000. (4pp)

  124. D. Ellis (2000)
    Improved recognition by combining different features and different systems
    Proc. AVIOS-2000, San Jose, May 2000. (7pp)

  125. H. Hermansky, D. Ellis and S. Sharma (2000)
    Tandem connectionist feature stream extraction for conventional HMM systems
    Proc. ICASSP-2000, Istanbul, III-1635-1638. (4pp)
    (See also the poster I presented at the conference.)

  126. S. Sharma, D. Ellis, S. Kajarekar, P. Jain and H. Hermansky (2000)
    Feature extraction using non-linear transformation for robust speech recognition on the Aurora database
    Proc. ICASSP-2000, Istanbul, II-1117-1120. (4pp)

  127. D. Genoud, D. Ellis and N. Morgan (1999)
    Combined speech and speaker recognition with speaker-adapted connectionist models
    Proc. Auto. Speech Recog. & Understanding Workshop, Keystone. (4pp)

  128. D. Abberley, S. Renals, T. Robinson and D. Ellis (1999)
    The THISL SDR system at TREC-8
    Proc. Text Retrieval Conference 8, Washington. (6pp)

  129. G. Williams and D. Ellis (1999)
    Speech/music discrimination based on posterior probability features
    Proc. Eurospeech-99, Budapest. (4pp)

  130. A. Janin, D. Ellis and N. Morgan (1999)
    Multi-stream speech recognition: Ready for prime time?
    Proc. Eurospeech-99, Budapest. (4pp)

  131. D. Ellis and N. Morgan (1999)
    Size matters: An empirical study of neural network training for large vocabulary continuous speech recognition
    Proc. ICASSP-99, Phoenix. (4pp)

  132. N. Morgan, D. Ellis, E. Fosler-Lussier, A. Janin and B. Kingsbury (1999)
    Reducing errors by increasing the error rate: MLP Acoustic Modeling for Broadcast News Transcription
    Presented at the DARPA Broadcast News Transcription and Understanding Workshop, Gaithersburg VA, February 28, 1999. (4pp)

  133. G. Cook, J. Christie, D. Ellis, E. Fosler-Lussier, Y. Gotoh, B. Kingsbury, N. Morgan, S. Renals, T. Robinson and G. Williams (1999)
    The SPRACH System for the Transcription of Broadcast News
    Presented at the DARPA Broadcast News Transcription and Understanding Workshop, Gaithersburg VA, February 28, 1999. (4pp)

  134. D. Ellis (1997)
    The Weft: A representation for periodic sounds
    Proc. Int. Conf. on Acous., Speech & Sig. Proc. ICASSP-97, Munich, vol. 2 pp. 1307-1310, April 1997. (4pp)
    (See also the poster I presented at the conference.)

  135. D. Ellis (1997)
    Computational Auditory Scene Analysis exploiting Speech-Recognition knowledge
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, Mohonk, October 1997. (4pp)

  136. D. Ellis (1996)
    Prediction-driven computational auditory scene analysis for dense sound mixtures
    Proc. ESCA Workshop on the Auditory Basis of Speech Perception, Keele, July 1996. (6pp)

  137. D. Ellis (1995)
    Underconstrained stochastic representations for top-down computational auditory scene analysis
    Proc. IEEE Workshop on Apps. of Sig. Proc. to Acous. and Audio, Mohonk, October 1995. (4pp)

  138. D. Ellis (1994)
    A computer implementation of psychoacoustic grouping rules
    Proc. 12th Intl. Conf. on Pattern Recognition, Jerusalem, October 1994. (9pp)

  139. D. Ellis (1993)
    Hierarchic models of sound for separation and restoration
    Proc. 1993 IEEE Mohonk workshop on Applications of Signal Processing to Acoustics and Audio, October 1993. (4pp)


Last updated: 2006/06/09 01:47:12
Dan Ellis