Department of Electrical Engineering - Columbia University

ELEN E4896 - Spring 2016

MUSIC SIGNAL PROCESSING

Assignments

This page lists the mini-project assignments as they are assigned.


Mini Project 1: Analog Synthesizer Emulation

Assigned: Wed 2016-02-10
Due: Wed 2016-02-24

The goal of this mini-project is to finish what we started in the analog synthesizer practical: come up with a good emulation of an analog synth sound using Pd.

You should choose one of the sounds from either the Loomer Aspect demo or the Juno-106 examples page, and work to duplicate that sound with your Pd patch. You might want to look at the spectrogram and/or waveform of the original sound to figure out what's going on, and to validate your reproduction. Working with an actual voice from the Aspect synth will let you see how it's put together within that synthesizer, although there's still some detective work to be done to figure out exactly what it does internally.

The project is due by 5pm on Wednesday 2016-02-24. You should submit a report of 3 pages or so (as a PDF, please) explaining what you did and how it worked. You should also submit your patches.


Mini Project 2: Note Detection

Assigned: Wed 2016-03-09
Due: Sun 2016-04-03

This mini-project follows on from the pitch tracking and autotuning practicals. Your assignment is to investigate using sigmund to provide accompaniment to, or a transformation of, an existing signal.

You can make any enhancements or modifications you like, but here are some possible ideas:

  • Investigate the pitch tracking accuracy of sigmund for a particular class of source such as voice or saxophone. In which cases does it make the most errors? Can you do anything, such as applying filtering, to improve this?
  • Try to improve the melody-tracking mode of sigmund by tuning the parameters. For a given musical example, how many of the notes is sigmund able to extract, and what is its error rate? What can you do to affect or improve this?
  • Investigate the sinusoid "copying" mode of the test_sigmund patch. What kind of signals lead to good resyntheses, and which are harder? Can you tune the parameters, or otherwise improve quality? What happens if you add more oscillators? What is the nature of the noise/distortion? (Describing what you hear, as well as looking at a spectrogram in Matlab, might be useful).
  • Use the note extraction to add pitch quantization (driving a separate synthesizer, or by reconstructing from sinusoids), but make it quantize to notes from a scale ("white notes") rather than every semitone. You could even try different scales (major, minor, pentatonic).
  • Improve the transitions between notes, perhaps including hysteresis in the quantization (so that it won't change note for vibrato).
  • Preserve the vibrato in the original pitch, i.e., small variations around a (corrected) average pitch. This sounds like a good way to make the result sound more "natural", although I haven't actually tried it in practice. You could also introduce artificial vibrato, regardless of the input, although sometimes you might want to "shape" this so it builds up slowly as the note is held.
  • Harmonizing - use one input pitch to synthesize several voices at different pitches. Note, to do this in the most "musical" way, the intervals (e.g., a third) should change depending on which note is being sung (e.g., always singing two "white notes" higher).
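
As a rough illustration of the scale quantization, hysteresis, and harmonizing ideas above, here is a minimal Python sketch of the logic, which you would still need to build as a Pd patch. It assumes pitches arrive as fractional MIDI note numbers; the function names, the C major scale, and the 0.3-semitone margin are all illustrative choices, not part of sigmund or the practical code.

```python
# Pitch classes of the C major scale ("white notes"); swap in another sorted
# tuple (e.g. minor or pentatonic) to try a different scale.
C_MAJOR = (0, 2, 4, 5, 7, 9, 11)


def nearest_scale_note(midi_pitch, scale=C_MAJOR):
    """Return the integer MIDI note in `scale` closest to a fractional pitch."""
    center = int(round(midi_pitch))
    # Scale notes here are at most 3 semitones apart, so a +/-2 search
    # window around the rounded pitch always contains the nearest one.
    candidates = [n for n in range(center - 2, center + 3) if n % 12 in scale]
    return min(candidates, key=lambda n: abs(n - midi_pitch))


def quantize_with_hysteresis(pitches, scale=C_MAJOR, margin=0.3):
    """Quantize a pitch track, changing note only on a clear improvement.

    The held note changes only when the input is more than `margin`
    semitones closer to the new candidate than to the current note, so
    vibrato around a note boundary does not cause rapid switching.
    """
    current = None
    out = []
    for p in pitches:
        candidate = nearest_scale_note(p, scale)
        if current is None or abs(p - current) - abs(p - candidate) > margin:
            current = candidate
        out.append(current)
    return out


def diatonic_shift(note, steps=2, scale=C_MAJOR):
    """Move a scale note up by `steps` scale degrees (2 = a diatonic third).

    `note` must already belong to `scale`, e.g. an output of
    nearest_scale_note; otherwise tuple.index raises ValueError.
    """
    degree = scale.index(note % 12)
    octave, new_degree = divmod(degree + steps, len(scale))
    return note - note % 12 + 12 * octave + scale[new_degree]


# A pitch wobbling across the C/D boundary (61) stays on C until it
# clearly reaches D.
track = [60.0, 60.9, 61.1, 60.9, 61.1, 62.0]
held = quantize_with_hysteresis(track)
harmony = [diatonic_shift(n) for n in held]
```

With `margin=0` the same track flips between C and D on every wobble, which is the behaviour the hysteresis is meant to suppress. Note that `diatonic_shift` gives a major third above C but a minor third above D, which is the "two white notes higher" behaviour described in the harmonizing idea.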

Here is the Stevie Wonder example of vibrato that I played in class: superstition.wav.

The project is due by midnight on Sunday 2016-04-03. As before, you should submit a report of 3 pages or so (as a PDF, please) explaining what you did and how it worked. You should also submit your patch or patches.


Mini Project 3: Chord Recognition

Assigned: Wed 2016-04-06
Due: Wed 2016-04-20

This project is an extension of this week's practical. In the practical, you experimented with a trained chord recognition system, evaluated over a number of Beatles tracks for which we have manual ground-truth data. The mini project assignment is to spend some more time trying to improve the final accuracy number by whatever means you like.

Unlike most of the material in the course so far, this is a task with a clear, quantitative evaluation goal. This makes it much easier to tell how you are getting on - you can quickly check whether a change makes things better or not. However, it can also be a distraction - beware of spending too much time tuning parameters to achieve the single, absolute maximum value. It's often more important to step back and think of entirely different places you can make changes.

Things to try include the areas mentioned in the practical. One is applying a compressive nonlinearity to the features, or some other normalization, such as reducing the magnitude of the single largest value, since this is often dominated by the melody line rather than the accompanying chord. Another is modifying the transition matrix, or the balance between the transition matrix probabilities and the probabilities evaluated by the Gaussians; it is this balance between local match and sequential constraints that gives the HMM its power, and raising the probabilities to a power before finding the Viterbi path may help.
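
To make these two ideas concrete, here is a minimal pure-Python sketch, not the practical's actual code. It assumes a chroma frame is a plain list of non-negative values and that per-frame chord likelihoods are already computed; the names `compress`, `viterbi`, and `obs_power` are hypothetical. Raising the observation probabilities to a power becomes a multiplication in the log domain.

```python
import math


def compress(chroma_frame, power=0.5):
    """Apply a compressive power-law nonlinearity to one chroma vector."""
    return [value ** power for value in chroma_frame]


def viterbi(obs_prob, trans, prior, obs_power=1.0):
    """Most likely state path, with observation probabilities raised to a power.

    obs_prob[t][s]: likelihood of frame t under state (chord) s
    trans[i][j]:    probability of moving from state i to state j
    prior[s]:       probability of starting in state s
    obs_power:      positive exponent on obs_prob; values below 1 give the
                    transition matrix relatively more weight, values above 1
                    give the local match more weight.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(prior)
    score = [log(prior[s]) + obs_power * log(obs_prob[0][s])
             for s in range(n_states)]
    backptr = []
    for frame in obs_prob[1:]:
        prev, score, ptr = score, [], []
        for j in range(n_states):
            # Best predecessor for state j at this frame.
            best = max(range(n_states), key=lambda i: prev[i] + log(trans[i][j]))
            ptr.append(best)
            score.append(prev[best] + log(trans[best][j])
                         + obs_power * log(frame[j]))
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Two chords with "sticky" transitions; the middle frame favours chord 1.
obs = [[0.9, 0.1], [0.2, 0.8], [0.9, 0.1]]
trans = [[0.9, 0.1], [0.1, 0.9]]
prior = [0.5, 0.5]
smooth_path = viterbi(obs, trans, prior, obs_power=1.0)
local_path = viterbi(obs, trans, prior, obs_power=4.0)
```

In this toy example the default exponent lets the transition matrix smooth over the middle frame, while an exponent of 4 makes the path follow the local evidence. On real data the useful range is something to determine experimentally.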

You could also try "HPSS" harmonic-percussive separation as suggested in the LibROSA demo notebook.

For a long time, the only good ground-truth chord label collection was for the Beatles (done by Chris Harte at Queen Mary, London). However, there are now some other collections that have been transcribed in the same way; there are chord labels on the isophonics web site, although they don't distribute audio.

You can also try calculating beat-synchronous chroma features for your own audio using e4896_beat_sync_chroma.ipynb. You won't be able to train from that data (unless you create your own chord label data), but you can recognize the chords in other tracks.

I encourage you to try to analyze the kinds of errors that the current system is making by inspecting the confusion matrix, and perhaps the particularly difficult tracks, and seeing if that will guide the kinds of changes you choose to make.
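
One simple way to start on this kind of error analysis is sketched below. It assumes you have two aligned lists of per-frame (or per-beat) chord labels, one from the ground truth and one from the recognizer; the function names and the example labels are made up for illustration.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count (true, predicted) chord label pairs over aligned frames."""
    return Counter(zip(true_labels, predicted_labels))


def top_confusions(counts, n=5):
    """Return the n most frequent errors as ((true, predicted), count)."""
    errors = Counter({pair: c for pair, c in counts.items()
                      if pair[0] != pair[1]})
    return errors.most_common(n)


def accuracy(counts):
    """Fraction of frames whose predicted label matches the true label."""
    total = sum(counts.values())
    correct = sum(c for (t, p), c in counts.items() if t == p)
    return correct / total if total else 0.0


# Made-up labels: A minor is repeatedly mistaken for its relative major.
truth = ["C", "C", "G", "G", "Am", "Am"]
guess = ["C", "Am", "G", "G", "C", "C"]
counts = confusion_counts(truth, guess)
worst = top_confusions(counts, n=1)
score = accuracy(counts)
```

Sorting the errors by count, rather than looking only at the overall accuracy, shows whether mistakes are concentrated in a few chord pairs (such as relative major/minor) that a targeted change might address.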

Here are some related papers:

The project report is due by 5pm on Wednesday, 2016-04-20. Although we are interested to know what accuracies you obtain on the test set, your grade will not be based on this raw measure of achievement. Rather, we will make an assessment of how well you analyzed the situation, how skillfully and originally you approached it, etc.


Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.
Dan Ellis <[email protected]>
Last updated: Tue Apr 05 11:11:54 PM EDT 2016