This is everything I've contributed to the worlds of computer science and electrical engineering so far: some code, papers, and demos. All of it is released under the GNU General Public License v3. I may need to put a notice about that in the files themselves, but for now I trust you.
I started a company with Marios Athineos and Graham Poliner, and the website I built for it is Major Miner's music labeling game. From the intro:
The goal of the game, besides just listening to music, is to label songs with original, yet relevant words and phrases that other players agree with. We're going to use your descriptions to teach our computers to recommend music that sounds like the music you already like.
Players are having a good time with it. You can see the top scorers on the leader board, but don't let them intimidate you; it's pretty easy to score points once you get the hang of it. Check it out if you have some time to play.
Dan Ellis, Tony Jebara, and I have been working on the problem of sound source localization. We take as our starting point binaural (two-microphone), reverberant recordings of one, two, or three simultaneous speakers. From these recordings, our system determines the direction from which the sounds are arriving and separates the speakers from one another as best it can.
I presented some observations that underlie the work at the SAPA workshop in September of 2006, the full model and EM algorithm at NIPS in December of 2006, and a poster on possible extensions to the work at the AMAP workshop at NIPS, also in December of 2006.
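To give a feel for the localization problem, here is a minimal sketch of the simplest possible baseline: estimating the delay between the two microphone signals by cross-correlation and converting it to an arrival angle. This is not the EM model from the papers, and it ignores reverberation and multiple speakers entirely. The function names, the far-field geometry, and the 343 m/s speed of sound are assumptions made for illustration.

```python
import math


def estimate_delay(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive result means the right channel is a delayed copy of the left.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation at this lag: sum of left[n] * right[n + lag].
        score = 0.0
        for n, sample in enumerate(left):
            m = n + lag
            if 0 <= m < len(right):
                score += sample * right[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag, sample_rate, mic_distance, speed_of_sound=343.0):
    """Convert a delay in samples to an arrival angle in degrees.

    Assumes a distant source and two microphones in free space
    (no head, no reverberation), which is a strong simplification.
    """
    x = lag / sample_rate * speed_of_sound / mic_distance
    x = max(-1.0, min(1.0, x))  # clamp so asin stays defined
    return math.degrees(math.asin(x))


# Toy example: the right channel is the left channel delayed by 3 samples.
left = [0, 0, 1, 2, -1, 0.5, 0, 0, 0, 0, 0, 0]
right = [0, 0, 0] + left[:-3]
lag = estimate_delay(left, right, max_lag=5)
angle = delay_to_angle(lag, sample_rate=16000, mic_distance=0.15)
```

With real reverberant recordings a single cross-correlation peak like this becomes unreliable, which is part of why a probabilistic model is attractive.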
Related stuff on my webpage:
Graham Poliner, Dan Ellis, and I have been working on the problem of playlist generation. This work went into our two publications, the first in the ACM Multimedia Systems Journal and the second at ISMIR 2005. The systems use SVM active learning to try to determine what you want to listen to. Take a look at the demo I put together for it.
In addition to the papers, a system based on this idea came in first place in the MIREX 2005 Artist identification competition at ISMIR and second place in the Genre identification competition.
Related stuff on my webpage:
Here are some final projects from classes I've taken here at Columbia. Maybe you want to see the full list of classes I've taken.
For my final project in Tony Jebara's Machine Learning course, cs4771, I implemented Carl Rasmussen's Infinite Gaussian Mixture Model. I got it working for both univariate and multivariate data. I'd like to see what it does when presented with MFCC frames from music and audio. Some parts of the implementation were tricky, so I wrote them up in a short paper describing my implementation. Since I've gotten the multivariate case working, I'll trust you to ignore all statements to the contrary in the paper. The IGMM requires Adaptive Rejection Sampling to sample the posteriors of some of its parameters, so I implemented that as well.
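One piece of the IGMM that is easy to show in isolation is the prior used when Gibbs sampling a data point's class indicator. Given the other points' assignments and a concentration parameter alpha, an existing class j gets probability n_j / (n - 1 + alpha), where n_j counts the other points in class j, and all not-yet-used classes together get alpha / (n - 1 + alpha). The sketch below computes only this prior; the Gaussian likelihood terms and the hyperparameter updates (the part that needs Adaptive Rejection Sampling) are left out, and the function name is made up for illustration.

```python
from collections import Counter


def indicator_prior(assignments, i, alpha):
    """Prior over classes for point i, given everyone else's assignments.

    Returns (probs, new_class_prob), where probs maps each class label
    used by the other points to its prior probability, and new_class_prob
    is the total prior mass on opening a new class.
    """
    counts = Counter(c for k, c in enumerate(assignments) if k != i)
    denom = len(assignments) - 1 + alpha
    probs = {label: n / denom for label, n in counts.items()}
    return probs, alpha / denom


# Four points; we resample the last one. Two others are in class 0, one in class 1.
probs, new_class_prob = indicator_prior([0, 0, 1, 0], i=3, alpha=1.0)
```

In the full sampler these prior weights would be multiplied by each class's likelihood for the point before normalizing and drawing a new indicator.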
Download related pieces:
For Professor Shih-Fu Chang's course, Graham Poliner and I put together a music retrieval system. It used active SVM learning (a form of relevance feedback) on Fisher kernel features to try to recommend similar songs to those the user has tagged as relevant, while avoiding those the user has tagged as irrelevant. We're still planning on trying out different features, classifiers, song databases, and ground truth, i.e. "this work is just preliminary."
Here's the abstract:
In order to manage growing music collections, a personal music recommender could find new music, appropriate to the user's mood, that he or she would like to listen to. This paper approaches these goals using the flexible search technique of active SVM learning that adapts to users' perceptions instead of vice versa. In the best case, active SVM learning requires fewer than half the number of training examples a normal SVM classification would require to achieve the same precision and recall. In addition to the idea of applying active SVM learning to the audio domain, the paper has contributed a collection of ground truth classification of popular songs and a preliminary software implementation of this recommender.
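The active learning loop described above hinges on choosing which songs to ask the user about next. A common rule is to query the unlabeled items closest to the current decision boundary, since those are the ones the classifier is least sure of. The sketch below shows only that selection step, assuming a linear decision function with already-trained weights. The text here does not say which selection rule or kernel the project used, so the rule, the function names, and the toy feature vectors are all illustrative assumptions.

```python
def decision_value(weights, bias, features):
    """Signed score of a linear classifier: positive means 'relevant'."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def pick_queries(weights, bias, unlabeled, k):
    """Return the names of the k unlabeled songs nearest the decision boundary.

    `unlabeled` maps a song name to its feature vector.
    """
    ranked = sorted(
        unlabeled,
        key=lambda name: abs(decision_value(weights, bias, unlabeled[name])),
    )
    return ranked[:k]


# Hypothetical two-dimensional features for four unlabeled songs.
unlabeled = {
    "a": [2.0, 0.0],   # score  2.0: confidently relevant
    "b": [0.5, 0.4],   # score ~0.1: very uncertain
    "c": [0.0, 3.0],   # score -3.0: confidently irrelevant
    "d": [1.0, 1.5],   # score -0.5: somewhat uncertain
}
queries = pick_queries([1.0, -1.0], 0.0, unlabeled, k=2)
```

After the user labels the queried songs, the classifier would be retrained on the enlarged labeled set and the selection repeated.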
"I'm going to update this any day now..." but you can download some related pieces here:
For Dan Ellis' Digital Signal Processing class, I did some work on audio fingerprinting, with an eye towards using it as a means for measuring the similarity of sounds. I'm going to keep working on this, but maybe from a different angle.
Here's the abstract:
Shazam's audio features, consisting of pairs of spectral peaks with their associated difference in time, form a useful representation for identifying identical audio clips in the presence of noise and distortion. This project implements a Shazam feature extractor and attempts to generalize it from the very specific identity detector to a less specific auditory similarity measure. This generalization unfortunately did not meet with much success, but we have created a number of reduced-data songs from the Shazam representation that are still recognizable even with no additional information from the original song.
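As a rough illustration of the peak-pair idea in the abstract, the sketch below turns a list of spectral peaks into (frequency, frequency, time difference) features and matches a query clip against a reference by voting on time offsets. It is not the project's extractor: finding the peaks in a spectrogram is skipped, and the names and the `max_dt` and `fan_out` parameters are hypothetical.

```python
from collections import Counter, defaultdict


def pair_peaks(peaks, max_dt=3, fan_out=2):
    """Pair each peak with up to `fan_out` later peaks within `max_dt` frames.

    `peaks` is a list of (time_frame, freq_bin). Returns a list of
    (anchor_time, (f1, f2, dt)) entries.
    """
    peaks = sorted(peaks)
    landmarks = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are sorted by time, so no later peak can qualify
            if dt > 0:
                landmarks.append((t1, (f1, f2, dt)))
                paired += 1
                if paired == fan_out:
                    break
    return landmarks


def best_offset(query, reference):
    """Return (offset, votes) for the most common reference-minus-query time offset."""
    index = defaultdict(list)
    for t, feature in reference:
        index[feature].append(t)
    votes = Counter()
    for t, feature in query:
        for t_ref in index[feature]:
            votes[t_ref - t] += 1
    return votes.most_common(1)[0] if votes else (None, 0)


# The query is an excerpt of the reference that starts one frame later.
reference = pair_peaks([(0, 10), (1, 20), (3, 15), (4, 30), (6, 12)])
query = pair_peaks([(0, 20), (2, 15), (3, 30)])
offset, votes = best_offset(query, reference)
```

Because matching requires identical pairs at a consistent offset, this representation is a very specific identity detector, which is the property the project tried to relax.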
Some related pieces: