Projects

This is everything I've contributed to the worlds of computer science and electrical engineering so far: some code, papers, and demos. All of it is released under the GNU General Public License v3. I may need to put something about that in the files themselves, but for now I trust you.

Publications (BibTeX)

Model-based expectation maximization source separation and localization. M. Mandel, R. Weiss, and D. Ellis (in press). IEEE Transactions on Audio, Speech, and Language Processing.
Evaluation of algorithms using games: the case of music annotation. E. Law, K. West, M. Mandel, M. Bay, and S. Downie (to appear). International Symposium on Music Information Retrieval, October 2009.
The ideal interaural parameter mask: a bound on binaural separation systems. M. Mandel and D. Ellis (to appear). IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, October 2009.
Improving MIDI-audio alignment with acoustic features. J. Devaney, M. Mandel, and D. Ellis (to appear). IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, October 2009.
A Web-Based Game for Collecting Music Metadata. M. Mandel and D. Ellis (2008). Journal of New Music Research, 37(2):151-165. [post-print]
Source separation based on binaural cues and source model constraints. R. Weiss, M. Mandel, and D. Ellis. Interspeech, September 2008.
Multiple-instance learning for music information retrieval. M. Mandel and D. Ellis. International Symposium on Music Information Retrieval, September 2008.
Active learning for interactive multimedia retrieval. T. Huang, C. Dagli, S. Rajaram, E. Chang, M. Mandel, G. Poliner, and D. Ellis. (2008). Proceedings of the IEEE, 96(4):648-667.
Cross-correlation of beat-synchronous representations for music similarity. D. Ellis, C. Cotton, and M. Mandel. Proc. ICASSP, April 2008, Pages 57-60, Las Vegas.
EM localization and separation using interaural level and phase cues. M. Mandel and D. Ellis. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, October 2007, Pages 275-278.
A Web-Based Game for Collecting Music Metadata. M. Mandel and D. Ellis. International Symposium on Music Information Retrieval, September 2007, Pages 365-366. [poster]
Building a binaural source separator. M. Mandel, D. Ellis, and T. Jebara. Advances in Models for Acoustic Processing Workshop, December 2006. [poster]
An EM algorithm for localizing multiple sound sources in reverberant environments. M. Mandel, D. Ellis, and T. Jebara. NIPS, December 2006, Pages 953-960. [poster]
A probability model for interaural phase difference. M. Mandel and D. Ellis. Statistical and Perceptual Audition Workshop, September 2006, Pages 1-6. [slides]
Support vector machine active learning for music retrieval. M. Mandel, G. Poliner, and D. Ellis. Multimedia Systems, May 2006, Pages 1-11. [Springer]
Song-level features and support vector machines for music classification. M. Mandel and D. Ellis. ISMIR, September 2005, Pages 594-599. [poster]
Distributed occlusion reasoning for tracking with nonparametric belief propagation. E. Sudderth, M. Mandel, W. Freeman, and A. Willsky. NIPS, December 2004, Pages 1369-1376.
Visual hand tracking using nonparametric belief propagation. E. Sudderth, M. Mandel, W. Freeman, and A. Willsky. Workshop on Generative Model Based Vision, CVPR, June 2004, Pages 189-197.

Major Miner's music labeling game

I started a company with Marios Athineos and Graham Poliner, and the website I built for it is Major Miner's music labeling game. From the intro:

The goal of the game, besides just listening to music, is to label songs with original, yet relevant words and phrases that other players agree with. We're going to use your descriptions to teach our computers to recommend music that sounds like the music you already like.

Players are having a good time with it. You can see the top scorers on the leader board, but don't let them intimidate you: it's pretty easy to score points once you get the hang of it. Check it out if you have some time to play.

Graduate Research

Binaural source localization

Dan Ellis, Tony Jebara, and I have been working on the problem of sound source localization. We take as our starting point binaural (two-microphone), reverberant recordings of one, two, or three simultaneous speakers. From these recordings, our system determines the direction from which the sounds are arriving and separates the speakers from one another as best it can.
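
To give a feel for the kind of cue such a system works from, here is a minimal sketch of estimating an interaural time difference and level difference from a two-channel signal. It is not the EM algorithm from the papers: it swaps in plain cross-correlation on a toy, reverberation-free signal, and every name in it (`estimate_delay`, `level_difference_db`, the 3-sample delay, the 0.5 gain) is an assumption made for illustration.

```python
import math
import random


def estimate_delay(left, right, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means the right channel lags the left one.
    """
    best_lag, best_score = 0, -math.inf
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                score += left[t] * right[u]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def level_difference_db(left, right):
    """Interaural level difference in dB (left energy relative to right)."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    return 10.0 * math.log10(e_left / e_right)


# Toy "recording" (an assumption, not real data): white noise that reaches
# the right microphone 3 samples late and at half the amplitude.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
delay = 3
left = source
right = [0.0] * delay + [0.5 * x for x in source[:-delay]]

lag = estimate_delay(left, right, max_lag=10)
ild = level_difference_db(left, right)
print("estimated delay (samples):", lag)
print("level difference (dB):", round(ild, 2))
```

With a single clean source the delay and level difference are easy to read off. The hard part, which this sketch does not attempt, is doing the same per time-frequency cell for several simultaneous speakers in a reverberant room.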

I presented some of the observations that underlie the work at the SAPA workshop in September 2006, the full model and EM algorithm at NIPS in December 2006, and a poster on possible extensions at the AMAP workshop at NIPS, also in December 2006.

Related stuff on my webpage:

The paper introducing the data and analyzing it for a single speaker from the SAPA workshop at ICSLP 2006.
The slides I used in the talk at SAPA.
The main paper describing our system from NIPS 2006.
The poster I presented at NIPS.
The poster describing future improvements to the system from the AMAP workshop at NIPS 2006.
The paper describing the addition of interaural level difference cues to the model from WASPAA 2007.

Music Similarity and Playlist Generation

Graham Poliner, Dan Ellis, and I have been working on the problem of playlist generation. This work went into our two publications, the first in the ACM Multimedia Systems Journal and the second at ISMIR 2005. The systems use SVM active learning to try to determine what you want to listen to. Take a look at the demo I put together for it.

In addition to the papers, a system based on this idea came in first place in the MIREX 2005 Artist identification competition at ISMIR and second place in the Genre identification competition.

Related stuff on my webpage:

Demonstration of automatic playlist generation
The paper for the ACM Multimedia Systems Journal, May 2006.
The paper we published at ISMIR 2005.
The extended abstract about the system that won MIREX 2005 Artist ID.
The results of MIREX 2005, including the extended abstracts for all of the competing systems.

Research Interests

Computational Auditory Scene Analysis
Sound source modelling and separation
Sound similarity
Music information retrieval
Analyzing music automatically with computers
Musical similarity
Machine learning, especially Bayesian and nonparametric

Classes

Here are some final projects from classes I've taken here at Columbia. Maybe you want to see the full list of classes I've taken.

The Infinite Gaussian Mixture Model

For my final project in Tony Jebara's Machine Learning course, cs4771, I implemented Carl Rasmussen's Infinite Gaussian Mixture Model. I got it working for both univariate and multivariate data. I'd like to see what it does when presented with MFCC frames from music and audio. Some parts of the implementation were tricky, so I wrote them up in a short paper describing my implementation. Since I've gotten the multivariate case working, I'll trust you to ignore all statements to the contrary in the paper. The IGMM requires Adaptive Rejection Sampling to sample the posteriors of some of its parameters, so I implemented that as well.

One sample taken from the IGMM on my version of the "spirals" dataset
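
One ingredient of the infinite GMM that is easy to show in isolation is the prior over component assignments. A point joins an existing component with probability proportional to the number of points already in it, or opens a new component with probability proportional to a concentration parameter. The sketch below draws assignments from that prior only (a Chinese restaurant process). It is not the full Gibbs sampler, it involves no Gaussians or Adaptive Rejection Sampling, and the function name and the choice `alpha=1.0` are assumptions made for illustration.

```python
import random


def sample_crp_assignments(n_points, alpha, rng):
    """Draw component indicators sequentially from a Chinese restaurant process."""
    counts = []        # counts[j] = number of points already in component j
    assignments = []
    for _ in range(n_points):
        # Existing component j has weight counts[j]; a new one has weight alpha.
        weights = counts + [alpha]
        j = rng.choices(range(len(weights)), weights=weights)[0]
        if j == len(counts):
            counts.append(1)   # open a new component
        else:
            counts[j] += 1     # join an existing component
        assignments.append(j)
    return assignments, counts


rng = random.Random(0)
assignments, counts = sample_crp_assignments(n_points=50, alpha=1.0, rng=rng)
print("number of components:", len(counts))
print("component sizes:", counts)
```

Larger values of `alpha` tend to produce more components. The number of components is never fixed in advance, which is what makes the mixture "infinite".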

Download related pieces:

The paper I wrote about implementing it.
My code
Jacob Eisenstein's Dirichlet process mixture model, which adds some cool features to the infinite GMM.

Active SVM Learning for Music Retrieval

For Professor Shih-Fu Chang's course, Graham Poliner and I put together a music retrieval system. It used active SVM learning (a form of relevance feedback) on Fisher kernel features to try to recommend songs similar to those the user has tagged as relevant, while avoiding those the user has tagged as irrelevant. We're still planning on trying out different features, classifiers, song databases, and ground truth, i.e. "this work is just preliminary."

Here's the abstract:

In order to manage growing music collections, a personal music recommender could find new music, appropriate to the user's mood, that he or she would like to listen to. This paper approaches these goals using the flexible search technique of active SVM learning that adapts to users' perceptions instead of vice versa. In the best case, active SVM learning requires fewer than half the number of training examples a normal SVM classification would require to achieve the same precision and recall. In addition to the idea of applying active SVM learning to the audio domain, the paper has contributed a collection of ground truth classification of popular songs and a preliminary software implementation of this recommender.
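
The loop behind this idea can be sketched in a few lines. The version below is an illustration under stated assumptions, not the system from the report. It uses a linear SVM trained by plain subgradient descent, a toy pool of 2-D points standing in for song features (no Fisher kernel), and the common "ask about the example closest to the decision boundary" query rule, which may differ from the criterion actually used. All names and parameter values are hypothetical.

```python
import random


def train_linear_svm(points, labels, epochs=200, rate=0.1, lam=0.01):
    """Fit (w, b) by subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(points[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - rate * lam) for wi in w]  # shrink from the regularizer
            if margin < 1.0:                           # hinge loss is active
                w = [wi + rate * y * xi for wi, xi in zip(w, x)]
                b += rate * y
    return w, b


def decision_value(w, b, x):
    """Signed score of x; its absolute value grows with distance from the boundary."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Toy pool: two clusters standing in for relevant (+1) and irrelevant (-1) songs.
rng = random.Random(0)
pool = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(100)]
pool += [(rng.gauss(-2, 1), rng.gauss(-2, 1)) for _ in range(100)]
truth = [1] * 100 + [-1] * 100   # plays the role of the user; read only when queried

labeled = [0, 100]               # seed with one example of each class
for _ in range(10):
    w, b = train_linear_svm([pool[i] for i in labeled], [truth[i] for i in labeled])
    unlabeled = [i for i in range(len(pool)) if i not in labeled]
    # Query the example the current classifier is least sure about.
    query = min(unlabeled, key=lambda i: abs(decision_value(w, b, pool[i])))
    labeled.append(query)

w, b = train_linear_svm([pool[i] for i in labeled], [truth[i] for i in labeled])
correct = sum((decision_value(w, b, x) > 0) == (y > 0) for x, y in zip(pool, truth))
accuracy = correct / len(pool)
print("labels requested:", len(labeled), "accuracy on the pool:", accuracy)
```

The point of querying near the boundary is that those labels move the classifier the most, so fewer of the user's judgments are spent on songs the system is already confident about.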

"I'm going to update this any day now..." but you can download some related pieces here:

The report itself (1.7 MB)
A zip of the MATLAB source and data files (3.1 MB)
The song snippets the UI plays for the user

Audio Fingerprinting

For Dan Ellis' Digital Signal Processing class, I did some work on audio fingerprinting, with an eye towards using it as a means for measuring the similarity of sounds. I'm going to keep working on this, but maybe from a different angle.

Here's the abstract:

Shazam's audio features, consisting of pairs of spectral peaks with their associated difference in time, form a useful representation for identifying identical audio clips in the presence of noise and distortion. This project implements a Shazam feature extractor and attempts to generalize it from the very specific identity detector to a less specific auditory similarity measure. This generalization unfortunately did not meet with much success, but we have created a number of reduced-data songs from the Shazam representation that are still recognizable even with no additional information from the original song.
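
The "pairs of spectral peaks with their associated difference in time" can be sketched as follows. This is a toy illustration, not the extractor from the report or Shazam's actual parameters. The 3x3 local-maximum rule, the threshold, the `max_dt` window, and the `fan_out` limit are all assumptions, and the spectrogram is a tiny hand-made grid rather than real audio.

```python
def find_peaks(spectrogram, threshold):
    """Return (time, freq) bins that reach threshold and exceed all 8 neighbors."""
    peaks = []
    n_frames = len(spectrogram)
    n_bins = len(spectrogram[0])
    for t in range(n_frames):
        for f in range(n_bins):
            value = spectrogram[t][f]
            if value < threshold:
                continue
            neighbors = [
                spectrogram[tt][ff]
                for tt in range(max(0, t - 1), min(n_frames, t + 2))
                for ff in range(max(0, f - 1), min(n_bins, f + 2))
                if (tt, ff) != (t, f)
            ]
            if all(value > v for v in neighbors):
                peaks.append((t, f))
    return peaks


def pair_peaks(peaks, max_dt, fan_out):
    """Pair each peak with up to fan_out later peaks at most max_dt frames away."""
    peaks = sorted(peaks)  # sort by time so the inner loop can stop early
    features = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt > 0:
                features.append((t1, f1, f2, dt))  # anchor time, two freqs, time gap
                paired += 1
                if paired == fan_out:
                    break
    return features


# Hand-made magnitude grid: rows are time frames, columns are frequency bins.
spectrogram = [
    [0, 0, 0, 0, 0],
    [0, 9, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 8, 0],
    [0, 0, 0, 0, 0],
    [7, 0, 0, 0, 0],
]

peaks = find_peaks(spectrogram, threshold=1)
features = pair_peaks(peaks, max_dt=3, fan_out=2)
print(peaks)     # [(1, 1), (3, 3), (5, 0)]
print(features)  # [(1, 1, 3, 2), (3, 3, 0, 2)]
```

Each feature depends only on two peak frequencies and the gap between them, not on absolute time, which is why matching clips can be found at any offset. It also hints at why the representation is so specific: a similar-sounding but different recording rarely reproduces the same exact pairs.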

Some related pieces:

The report (1.5 MB)
Some example sound files from the paper
The source code

Don't forget to look at my undergrad work