Practicals

This page contains descriptions and instructions for weekly practical sessions.
Weds 2013-05-01: Unmixing

This week we will experiment with separating musical sources by using Nonnegative Matrix Factorization (NMF) to decompose a short-time Fourier transform magnitude (i.e., a spectrogram) into a time-frequency mask that can be used to isolate sources. We will be using nmf_kl_sparse_v.m, an implementation of KL-based NMF separation using Virtanen's approach to sparsity, from Graham Grindlay's NMFlib. You can download all the pieces for this practical in prac13.zip, and you can read about the underlying algorithm in: T. Virtanen, Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria, IEEE Tr. Audio, Speech & Lang. Proc., 15(3):1066-1074, March 2007.

The Matlab script below leads you through how to use these tools and data:

% Load piano example
[d,sr] = wavread('gould-wtc1.wav');
% Can we separate first two notes from rest of first arpeggio?
% We'll use stft.m, which is like specgram but has istft.m
fftlen = 512;
D = stft(d, fftlen);
% Make the axis values and plot as spectrogram
% (one time value per STFT column, one frequency value per STFT row)
tt = [0:size(D,2)-1]*(fftlen/2)/sr;
ff = [0:fftlen/2]/fftlen*sr;
subplot(411)
imagesc(tt, ff, 20*log10(abs(D))); axis xy; caxis([-40 20]); axis([0 10 0 2000])
% gap between 2nd and 3rd fundamentals is about 370 Hz - which STFT bin?
370/sr*fftlen
% Say bin 18
% Try separating low and high parts with STFT masking:
DR = D; DR(18:end,:) = 0;
subplot(412)
imagesc(tt, ff, 20*log10(abs(DR))); axis xy; caxis([-40 20]); axis([0 10 0 2000])
% Resynthesize
dr = istft(DR);
soundsc(dr(1:10*sr),sr);
% Just lowest notes, but pretty muffled
% How does the remaining part sound? Try complementary HP mask:
DR = D; DR(1:17,:) = 0;
imagesc(tt, ff, 20*log10(abs(DR))); axis xy; caxis([-40 20]); axis([0 10 0 2000])
dr = istft(DR);
soundsc(dr(1:10*sr),sr);
% Just sounds high-pass, can still hear all notes
% NMF separation of spectrogram
% Use Virtanen's KL-based objective to factor STFT magnitude
% Fit 40 terms
r = 40;
[W,H] = nmf_kl_sparse_v(abs(D), r);
% Compare to original
subplot(412)
imagesc(tt, ff, 20*log10(abs(W*H))); axis xy; caxis([-40 20]); axis([0 10 0 2000])
% Full reconstruction looks pretty good. Let's check components
% Since component ordering is arbitrary, attempt to sort by pitch
I = order_dict(W);
% Now plot dictionary terms
subplot(413)
imagesc(1:r, ff, 20*log10(W(:,I))); axis xy; caxis([-40 20]); axis([0, r, 0 2000])
% Components mostly have clear correspondence to fundamentals
% Now look at activations:
subplot(414)
imagesc(tt,1:r, 20*log10(H(I,:))); axis xy; caxis([-30 30]); axis([0 10 0, r])
% You can sort-of see different notes, but quite noisy
% Try partial reconstruction with just lowest components
X = I(1:15);
% Have a look at resulting spectrogram (mask)
subplot(412)
imagesc(tt, ff, 20*log10(abs(W(:,X)*H(X,:)))); axis xy; caxis([-40 20]); axis([0 10 0 2000])
% Now reconstruct from that part of the spectrogram
dr = resynth_nmf(D,W,H,X);
soundsc(dr(1:10*sr), sr);
% Not bad at separating lowest notes, doesn't sound muffled
% What about remainder?
X = I(16:r);
imagesc(tt, ff, 20*log10(abs(W(:,X)*H(X,:)))); axis xy; caxis([-40 20]); axis([0 10 0 2000])
dr = resynth_nmf(D,W,H,X);
soundsc(dr(1:10*sr), sr);
% Some removal of lower notes, but mostly still there
% Now try NMF with sparsity penalty enabled
[W1,H1] = nmf_kl_sparse_v(abs(D),r,'alpha', 1);
subplot(412)
imagesc(tt, ff, 20*log10(abs(W1*H1))); axis xy; caxis([-40 20]); axis([0 10 0 2000])
I1 = order_dict(W1);
subplot(413)
imagesc(1:r, ff, 20*log10(W1(:,I1))); axis xy; caxis([-40 20]); axis([0, r, 0 2000])
% Dictionary pretty similar
subplot(414)
imagesc(tt,1:r, 20*log10(H1(I1,:))); axis xy; caxis([-30 30]); axis([0 10 0, r])
% Activations are much more sparse
X1 = I1(1:15);
subplot(412)
imagesc(tt, ff, 20*log10(abs(W1(:,X1)*H1(X1,:)))); axis xy; caxis([-40 20]); axis([0 10 0 2000])
dr1 = resynth_nmf(D,W1,H1,X1);
soundsc(dr1(1:10*sr),sr)
% Good separation of low notes
X1 = I1(16:r);
subplot(412)
imagesc(tt, ff, 20*log10(abs(W1(:,X1)*H1(X1,:)))); axis xy; caxis([-40 20]); axis([0 10 0 2000])
dr1 = resynth_nmf(D,W1,H1,X1);
soundsc(dr1(1:10*sr),sr)
% And now high notes have low notes mostly removed!

Things to investigate:
There is no sign-off requirement this week.

Weds 2013-04-24: Fingerprinting

Our practical investigation of fingerprinting will use a re-implementation of the Shazam fingerprint system that I put together (see also the newer version including compiled Matlab target at audfprint). You can look at the explanation and examples on that web page to see how it works, but we will use a more recent version of the code, which is more efficient: prac12.zip. We will also use pre-built hash tables, since populating the hash table takes about 5 sec / track, which adds up for the artist20 database of 1413 tracks (6 albums across 20 artists) we are using today. We have three tables to play with: HTA20-20hps.mat (30MB) is the largest and most detailed, generated with 20 hashes/sec. HTA20-10hps.mat (17MB) is smaller since it only recorded about 10 hashes/sec for the reference items, and HTA20-10hps-20c.mat (13MB) saved only a maximum of 20 tracks per hash (instead of 100), giving it a much smaller RAM footprint of 80 MB compared to 400 MB for the other two.

The Matlab script below leads you through how to use these tools and data:

% Set up commands
addpath('mp3readwrite'); % mp3 reading
% Calculate fingerprints for some audio
[d,sr] = mp3read('http://labrosa.ee.columbia.edu:8013/beatles/Revolver/01-Taxman.mp3');
% ("Let It Be" is not in this database!)
% Find the landmark pairs
L = find_landmarks(d,sr);
size(L)
L(1:10,:)
% Each row of L is {start-time-col start-freq-row end-freq-row delta-time}
% in the quantized units of the time-frequency cells
% Visualize them superimposed on the spectrogram (zoom in on first 20 secs)
show_landmarks(d,sr,L)
axis([0 20 0 4000])
% We can convert each pair into a single, 20 bit hash:
H = landmark2hash(L);
size(H)
H(1:10,:)
% Each row of H is {track_id start_time hash_val}, where the track_id defaults to 0
% For building the database, we'd store the track_id and start_time keyed by the hash_val
% Try loading the precalculated database
global HashTable HashTableCounts
load HTA20-10hps-20c
whos
% The simple hash table has 1048576 columns (one for each possible 20 bit value)
% Each column consists of 32 bit values; the top 17 bits are the track_id
% and the bottom 15 bits are the time offset within the track.
% We can see which tracks contain any given hash (3rd column of H):
HashTable(:,H(1,3)+1)
% (we add 1 to the hash value to avoid trying to access array element 0)
% Zero entries in the HashTable are where it's empty.
% We get the track indices by dividing by 16384; Names{} converts the
% track indices into the actual tracks recorded in the hash table
Names{HashTable(1:5,H(1,3)+1)/16384}
% "Taxman" is there, although it's not the only one. However, if we try another hash:
Names{HashTable(1:5,H(2,3)+1)/16384}
% .. we get a different set of tracks, with Taxman as the only repeat.
% match_query is the main routine to search the hash table, and
% illustrate_match shows the results, as described in the fingerprinting web page
% You can use fingerprints to examine the relationship between two "related" tracks:
[d1,sr] = mp3read('http://labrosa.ee.columbia.edu:8013/depeche_mode/Speak_and_Spell/11-Just_Can_t_Get_Enough.mp3');
[d2,sr] = mp3read('http://labrosa.ee.columbia.edu:8013/depeche_mode/Speak_and_Spell/16-Just_Can_t_Get_Enough_Schizo_Mix_.mp3');
bestlinalign(d1,sr,d2,sr);
% bestlinalign attempts to find a linear time warp between the hashes in two files.
% It also shows the scatter of how one track appears in the other.
% Note the section between 160 and 180 s which is skewed differently from the rest
% We can evaluate the fingerprinter by generating a set of queries from random positions in a few true tracks:
% (we're grabbing the soundfiles across the network).
% The default is 30s queries with no added white noise:
tru = 100:100:1400;
[Q,SR] = gen_random_queries(addprefixsuffix(Names(tru),'http://labrosa.ee.columbia.edu:8013/','.mp3'));
% If we run the fingerprint query on these, we should get the "tru" track_ids back
% eval_fprint just runs match_query for each element of Q and returns the top hit
% You can truncate and add noise too; here, truncate to 4 sec, but very little noise (high SNR)
Ttrunc = 4.0;
SNR = 60;
[S,R,QT] = eval_fprint(Q,SR,tru, Ttrunc, SNR);
R
% You should see most of the first row matching the values of tru, but sometimes not.
% You can examine a match in more detail. QT returns the truncated/noised queries.
% Show the matching landmarks
[dm,srm] = illustrate_match(QT{2},SR,addprefixsuffix(Names,'http://labrosa.ee.columbia.edu:8013/','.mp3'));
% Listen to both query and candidate match in stereo:
soundsc([dm,QT{2}],SR)
% (When it's wrong, it's completely wrong)
% To build your own database, use clear_hashtable() then add_tracks(list)

Things to investigate:
To get signed off, show the TAs a plot of fingerprinter performance vs. some systematically-varying challenge such as added noise level, bandwidth, etc., or some interesting particular fingerprinting match or false alarm. Be prepared to explain, qualitatively, your results.

Followup: I got my routine to recover the hashes directly from the database, and was able to make a full list of all the tracks in the database with significant overlap. The new function get_hashes_for_track is now in the code package:

% Query the hashtable with the hashes of each reference item
for i = 1:length(Names); ...
  R = match_query(get_hashes_for_track(i)); ...
  bb(i) = R(2,1); ...
  bc(i) = R(2,2); ...
end
% R(1,:) will be the query item itself (since it is in the database)
% but R(2,:) will be the best matching other item in the database.
% We store both its ID (in bb) and the number of matching hashes (bc).
% Looking at the number of matching hashes will reveal non-chance matches.
plot(bb)
Names{(find(bc>10))}
% Turns out there are five pairs of remixed/alternate versions in the database
% as well as one track with a different vocal over the same backing loop,
% and one long gratuitous quote ("Hammer to Fall" in "I Want To Break Free (Extended Mix)").

Weds 2013-04-17: Incipit Matching

This week we will use the Echo Nest Analyze API to find "incipits" -- fragments of music from the beginnings of phrases -- and search for them within a database of recordings. We'll be working in Matlab. The Matlab script below is mostly self-explanatory. You can download the scripts in prac11.zip and data in AllIncipits.mat (43MB).
% Set up commands
addpath('mp3readwrite'); % mp3 reading
% Run the Echo Nest Analyze on an MP3 file
ENA = en_analyze('test.mp3',1)
% Look at the results - plot the per-segment chroma using the segment times
axs = subplot(311)
plot_chroma(ENA.pitches, ENA.segment);
% Compare to spectrogram
[d,sr] = mp3read('test.mp3',0,1,2);
ax = subplot(312)
specgram(d,512,sr);
caxis([-30 30]);
colormap(1-gray);
% (linkaxes will let you scroll the panes in sync)
axs = [axs,ax];
linkaxes(axs,'x')
axis([0 30 0 5000]);
% Superimpose the segment start times to check they make sense
overplot_times(ENA.segment,'y');
% Now look at the beat times
overplot_times(ENA.beat,'r');
% .. and the bar times
overplot_times(ENA.bar,'b');
% .. and the sections (major phrase breaks)
overplot_times(ENA.section,'g');
% Do the sections make sense? Listen to 10 s excerpts
soundsc(seltime(d,sr,ENA.section(1),ENA.section(1)+10),sr);
soundsc(seltime(d,sr,ENA.section(2),ENA.section(2)+10),sr);
% sometimes...
% Calculate the beat-synchronous chroma from the segments
BC = en_beatchroma(ENA);
ax = subplot(313);
plot_chroma(BC,ENA.beat);
axs = [axs,ax];
linkaxes(axs,'x');
% We define incipits as the first N beats after each
% section (starting from the nearest bar division),
% and represent them with beat-chroma matrices
[In,St,En,Bt] = make_incipits(ENA);
subplot(321)
In1 = squeeze(In(1,:,:));
imagesc(In1); axis xy
title('1st incipit of test track')
% Incipits are key-normalized, so may not exactly match raw beat-chroma
% St, En are the actual start and end times
% listen to audio
soundsc(seltime(d,sr,St(1),En(1)),sr);
% compare to chroma resynthesis
% (make sure synthesis times start from 0)
soundsc(synthesize_chroma(In1,ENA.beat(Bt(1)+[0:31])-ENA.beat(Bt(1)),sr),sr)
% AllIncipits.mat contains incipits from 8000+ tracks of uspop2002
AI = load('AllIncipits.mat');
% Calculate the distance to all of them
% Incipits are stored in AI.Incipits as unravelled vectors of 384 (=12x32) values
% Just match on the first 16 beats (of 32) - a smaller space
mb = 16;
nchr = 12;
dist = sqrt(sum((AI.Incipits(:,1:mb*nchr) - ...
       repmat(reshape(In1(:,1:mb),1,mb*nchr),size(AI.Incipits,1),1)).^2,2));
% Sort by distance
[vv,xx] = sort(dist);
% Plot the most similar one
ix = xx(1);
subplot(323)
imagesc(reshape(AI.Incipits(ix,:),12,32)); axis xy
% Which track is it?
title(['Incipit from ',AI.Names{AI.Tracks(ix)},' @ time ',num2str(AI.Starts(ix))], ...
      'Interpreter','none')
% ('Interpreter' stops it trying to translate underscores)
% Listen to the chroma resynth to see if it's similar
soundsc(synthesize_chroma(reshape(AI.Incipits(ix,:),12,32),0.35,sr),sr)
% Download & listen to the original audio
[d2,sr2] = mp3read(['http://labrosa.ee.columbia.edu:8013/',AI.Names{AI.Tracks(ix)},'.mp3']);
soundsc(seltime(d2,sr2,AI.Starts(ix),AI.Ends(ix)),sr2);
% Similar?

Things to try:
To sign off, show the TAs one "interesting" match pair, and explain to them why you think it is interesting.

Weds 2013-04-10: Chord Recognition

This week we will train and use a simple Hidden Markov Model to do chord recognition. We will be using precomputed chroma features along with the ground-truth chord labels for the Beatles opus that were created by Chris Harte of Queen Mary, University of London. This practical is all run in Matlab. Here's an example of using the Matlab scripts. You can download them all as chords_code.zip, and the associated data as chords_data.zip.
TrainFileList = textread('trainfilelist.txt','%s');
% Load beat-synchronous chroma for "Let It Be" - item 135
[Chroma,Times] = load_chroma(TrainFileList{135});
% Resynthesize with Shepard tones
SR = 16000;
X = synthesize_chroma(Chroma,Times,SR);
% Listen to first 20 seconds
soundsc(X(1:20*SR),SR)
% Somewhat recognizable
% Train Gaussian models for each chord from whole training set
[Models,Transitions,Priors] = train_chord_models(TrainFileList);
% Look at the means of the 25 learned models (nochord + 12 major + 12 minor)
for i = 1:25; MM(:,i) = Models(i).mean'; end
imagesc(MM)
% Try recognizing chords in Let It Be (which was in the train set, so cheating)
[HypChords, LHoods] = recognize_chords(Chroma,Models,Transitions,Priors);
% We can look at the best (Viterbi) path overlaid on the per-frame log likelihoods
imagesc(max(-10,log10(LHoods)));
colormap(1-gray)
colorbar
hold on; plot(HypChords+1,'-r'); hold off
% Look just at the first hundred beats
axis([0 100 0.5 25.5])
xlabel('time / beats');
ylabel('chord');
keylabels = '-|C|C#|D|D#|E|F|F#|G|G#|A|A#|B|c|c#|d|d#|e|f|f#|g|g#|a|a#|b';
set(gca,'YTick',1:25);
set(gca,'YTickLabel',keylabels);
% Compare the Viterbi (HMM) path to the simple most-likely model for each frame
[Val,Idx] = max(LHoods);
hold on; plot(Idx, 'og'); hold off
% The HMM transition matrix makes it more likely to stay in any given state,
% thus it smooths the chord sequence (eliminates single-frame chords)
% Evaluate accuracy compared to ground-truth
TrueChords = load_labels(TrainFileList{135});
% Add the true labels to the plot
hold on; plot(TrueChords+1, '.y'); hold off
legend('Viterbi','Best','True')
% HypChords and TrueChords are simple vectors of labels in range 0..24.
% What is the average accuracy for this track?
mean(HypChords==TrueChords)
% 71.5% - pretty good!
% For reference, the best per-frame model, without the HMM, gives
mean(Idx-1 == TrueChords) % subtract 1 to convert indices 1..25 into chords 0..24
% 44.9% - nowhere near as good
% To get the full confusion matrix (rows=true, cols=recognized as):
[S,C] = score_chord_recognition(HypChords,TrueChords);
imagesc(C);
set(gca,'XTick',1:25);
set(gca,'XTickLabel',keylabels);
set(gca,'YTick',1:25);
set(gca,'YTickLabel',keylabels);
% Most common confusion is F being recognized as C.
% What do the true chords sound like when rendered as Shepard tones?
LabelChroma = labels_to_chroma(TrueChords);
% .. creates a simple chroma array with canonical triads for each chord
X2 = synthesize_chroma(LabelChroma,Times,SR);
soundsc(X2(1:20*SR),SR)
% Compare "target" chroma, actual chroma, and both true and hypothesized labels
subplot(311)
imagesc(LabelChroma);
axis xy
set(gca, 'YTick', [1 3 5 8 10 12]'); set(gca, 'YTickLabel', 'C|D|E|G|A|B');
subplot(312)
imagesc(Chroma);
axis xy
set(gca, 'YTick', [1 3 5 8 10 12]'); set(gca, 'YTickLabel', 'C|D|E|G|A|B');
subplot(313)
plot(1:length(TrueChords),TrueChords,'o',1:length(HypChords),HypChords,'.r')
legend('True','Hyp')
set(gca,'YTick',[1 3 5 8 10 13 15 17 20 22]); set(gca,'YTickLabel','C|D|E|G|A|c|d|e|g|a');
axis([0 length(TrueChords) 0 25])
colormap hot
% This gives the picture above
% Evaluate recognition over entire test set
TestFileList = textread('testfilelist.txt','%s');
[S,C] = test_chord_models(TestFileList,Models,Transitions,Priors);
% Overall recognition accuracy = 57.7%
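
The smoothing effect of the HMM noted in the script comments can be seen on a toy problem. The sketch below is a minimal log-domain Viterbi decoder in plain Python; it is not the viterbi_path.m from the HMM Toolbox used in the practical, and the two-state model, its 0.9 self-transition probability, and the per-frame likelihoods are all made-up values chosen purely for illustration.

```python
import math

def viterbi_path(log_prior, log_trans, log_lik):
    """Return the most likely state sequence.

    log_prior[s]    : log probability of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_lik[t][s]   : log likelihood of frame t under state s
    """
    n_states = len(log_prior)
    # Best log score of any path ending in each state at the first frame
    score = [log_prior[s] + log_lik[0][s] for s in range(n_states)]
    backptrs = []
    for t in range(1, len(log_lik)):
        new_score, ptr = [], []
        for s in range(n_states):
            # Best predecessor for state s at frame t
            best = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            ptr.append(best)
            new_score.append(score[best] + log_trans[best][s] + log_lik[t][s])
        score = new_score
        backptrs.append(ptr)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical toy data: frame 2 slightly prefers state 1, the rest prefer state 0
lik = [[0.8, 0.2], [0.8, 0.2], [0.4, 0.6], [0.8, 0.2], [0.8, 0.2]]
prior = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]  # "sticky" transitions favor staying put

log_lik = [[math.log(v) for v in frame] for frame in lik]
log_prior = [math.log(v) for v in prior]
log_trans = [[math.log(v) for v in row] for row in trans]

# Per-frame best model, ignoring transitions
framewise = [max(range(2), key=lambda s: frame[s]) for frame in lik]
smoothed = viterbi_path(log_prior, log_trans, log_lik)
print(framewise)  # [0, 0, 1, 0, 0]
print(smoothed)   # [0, 0, 0, 0, 0]
```

With these assumed numbers the frame-wise choice flips to state 1 for a single frame, whereas the Viterbi path stays in state 0 throughout, because two state changes cost more than one poorly matching frame.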
Things to try:
Notes: The code includes and makes use of gaussian_prob.m, viterbi_path.m, and normalise.m, all from Kevin Murphy's wonderful HMM Toolbox.

This week's practical is the basis of Mini Project 3, so you don't need to get signed off, but simply submit your third and final miniproject in two weeks (on Weds April 24th).

Weds 2013-04-03: Beat tracking

As promised, we now move on from the real-time processing of Pd to do some offline analysis using Matlab. (In fact, beat tracking in real time is an interesting and worthwhile problem, but doing it offline is much simpler.) I've put together a cut-down version of my dynamic programming beat tracker for us to play with. It includes:
There are also a number of helper/utility functions:
All these functions are available in prac09.zip. The collection of 20 example excerpts and human tapping data that McKinney and Moelants donated for MIREX 2006 (some 50 MB) is separately available as mirex06examples.zip. Here are some things to try:
To get signed off, show the TAs an example of a modification to the code or parameters you made, and an explanation of how and why it changed the overall performance of the beat tracker.

Weds 2013-03-27: Autotune

Given pitch tracking and pitch modification, we can now put them both together to modify the pitch towards a target derived from the current input pitch, i.e., autotune, in which a singer's pitch is moved to the nearest exact note to compensate for problems in their intonation. We can use sigmund both to track the singing pitch, and to analyze the voice into sinusoids which we can then resynthesize after possibly changing the pitch. We'll use the following Pd patches:
You can download all these patches in prac08.zip. Loading a sound file then playing it into the patch should generate a close copy of the original voice, but quantized to semitone pitches. The "pitch smoothing" slider controls how abruptly the pitch moves between notes. Try it on some voice files, such as the Marvin Gaye voice.wav, the query-by-singing example 00014.wav, or my pitch sweep ahh.wav. You can also try it on live input by hooking up the adc~ instead of the soundfile playback, but you will probably need to use headphones to avoid feedback. Here are some things to investigate:
This week's practical is the basis of Mini Project 2, so you don't need to demonstrate anything, just submit your project report in two weeks (on Weds April 10th).

Weds 2013-03-13: Pitch tracking

Miller Puckette (author of Pd) created a complex pitch tracking object called sigmund~. This week we'll investigate its use and function. You will use the following Pd patches:
You can download these in prac07.zip. sigmund~ operates in various different modes - as a raw pitch tracker, as a note detector/segmenter, and also as a sinusoid tracker. We'll try each mode.
For credit, you must show a TA either (a) an interesting case where the sigmund pitch tracker makes an obvious error (with your explanation), or (b) an example of successful sigmund note detection, showing how this depends on the parameters, or (c) some kind of note transformation using the sinusoid representation.

Weds 2013-03-06: Reverb

For this week's practical you will examine a reverberation algorithm, trying to understand the link between the algorithm controls and pieces, and the subjective experience of the reverberation. We will be working with the algorithm in the rev2~ reverberator patch that comes with Pd, although we'll be modifying our own version of it. It's based on the design described in the 1982 Stautner and Puckette paper. You will use the following Pd patches:
You can download them all in prac06.zip. The main test harness allows you to adjust the control parameters of the reverb patch, and to feed in impulses, short tone bursts of different durations, or sound files. You can also sample the impulse response and write it out to a wave file, to be analyzed by an external program.
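
One possible external analysis of a saved impulse response is to estimate its decay time. The sketch below uses Schroeder backward integration, a standard room-acoustics technique that is not part of this practical's patches, applied to a synthetic exponentially decaying response. The function name, the 8 kHz sample rate, the 0.5 s decay, and the -5 to -25 dB fitting range are all illustrative assumptions; a real impulse response would first have to be read from the wave file written by the test harness.

```python
import math

def estimate_t60(h, sr, hi_db=-5.0, lo_db=-25.0):
    """Estimate the 60 dB decay time (seconds) of impulse response h."""
    # Energy decay curve: energy remaining from each sample to the end
    edc = [0.0] * len(h)
    total = 0.0
    for n in range(len(h) - 1, -1, -1):
        total += h[n] * h[n]
        edc[n] = total
    # Collect (time, level in dB re. total energy) points in the fitting range
    pts = []
    for n, e in enumerate(edc):
        if e <= 0.0:
            continue
        level = 10.0 * math.log10(e / edc[0])
        if lo_db <= level <= hi_db:
            pts.append((n / sr, level))
    # Least-squares straight-line fit of level against time
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_l = sum(l for _, l in pts) / len(pts)
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))
    # Extrapolate the fitted slope (dB per second) to a 60 dB drop
    return -60.0 / slope

# Synthetic test response: amplitude falls by 60 dB after true_t60 seconds
sr = 8000
true_t60 = 0.5
h = [10.0 ** (-3.0 * n / (sr * true_t60)) for n in range(sr)]
t60 = estimate_t60(h, sr)
print(round(t60, 3))
```

On this idealized input the estimate should come out close to the 0.5 s that was built in. Measured responses from the reverb patch will be noisier, so the fitting range may need adjusting.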
Here are some sound files you can use to try out the reverberator: voice.wav, guitar.wav, drums.wav. Your task is to show one of the TAs a modified version of the reverb patch, showing how your modification changed the impulse response of the reverberator, and how that change affected the sound.

Weds 2013-02-27: LPC

This week we will experiment with LPC analysis/synthesis in Pd using the lpc~ and lpreson~ units by Edward Kelly and Nicolas Chetry, which are included with the Pd-extended package. The patch lpc.pd uses these units to perform LPC analysis and synthesis, including taking the LPC filters based on one sound and applying them to a different sound (cross-synthesis). The main patch handles loading, playing, and selecting the soundfiles, and uses the additional patches loadsoundfile.pd to read in sound files, playloop.pd to play or loop a sound file, and audiosel~.pd to allow selecting one of several audio streams.

The LPC part is done by the lpcanalysis subpatch of the main patch, shown to the right. It "pre-emphasizes" both voice and excitation to boost high frequencies, then applies lpc~ to the voice signal to generate a set of LPC filter coefficients and a residual. The [myenv] subpatch then calculates the energy (envelope) of the residual, and the excitation is scaled to reflect this envelope. This excitation, along with the original filter coefficients (delayed by one block to match the delay introduced by [myenv]), is passed to lpreson~, which simply implements the all-pole filter to impose the voice's formant structure on the excitation. A final de-emphasis re-balances low and high frequencies. You can download these patches in prac05.zip.

The entire lpcanalysis subpatch is applied to overlapping windows of 1024 samples at half the sampling rate of the parent patch (i.e. 1024/22050 = 46.4 ms) thanks to the block~ unit, a special feature of Pd which allows subpatches to have different blocking etc. from their parents. On coming out of the subpatch, Pd will overlap-add as necessary, so we apply a final tapered window (from the $0-hanning array) to the outputs. tabreceive~ repeatedly reads the hanning window on every frame. Here is a list of options for things to investigate:
For credit, you must explain to a TA the results of two of the options 3-7, or show implementations of 8, 9 or 10.

Weds 2013-02-20: Sinusoidal Synthesis

This week we use Pd to perform additive synthesis by controlling the frequencies and magnitudes of a bank of sinewave oscillators based on parameters read from an analysis file written by the SPEAR program we saw in class. The main additive patch instantiates a bank of 32 oscillators, and provides the controls to load an analysis file, to trigger playback, and to modify the time and frequency scale. The actual parsing of the SPEAR file is provided by loadspearfile, and the individual sinusoid partials are rendered by mypartial. Here are some analysis files for notes from a violin, trumpet, and guitar. You can download all these files in prac04.zip.

You can experiment with playing back each of these, turning individual partials on or off, and adjusting the time and frequency scales. When the analysis file contains more harmonics than the number of oscillators (32), some of the sinusoids are dropped from the synthesis. You can identify (roughly) which harmonic is which by the average frequency and summed-up magnitude of each harmonic, which gets displayed on each individual [mypartial] patch in additive.pd. Here are the things to try:
Note: if you modify the mypartial patch, it's probably a good idea to close and reopen the parent additive patch. I'm not sure how good Pd is at keeping multiple instantiations in sync when one instance is edited. Re-opening the parent patch will re-instantiate all the mypartials, so they should all get updated.

For credit this week, you'll need to create one sinusoidal model of your own (option 5), then implement one or more of options 1-4.

Weds 2013-02-13: Analog synthesis

This week we'll experiment with simulating an analog synthesizer with Pd. The Pd analog synth simulator consists of several patches:
You can download all these patches along with some support functions in the zip file prac03.zip. Load demo_voice.pd into Pd, and the synth should run. Things to investigate:
This week's practical is the basis of mini-project 1, so you won't have to get it signed off; instead, you'll have to submit a short report in two weeks' time.

Weds 2013-02-06: Filtering

Last week we looked at a fairly complex structure built in Pd. This week, we'll back up a bit and play with some simple filters within Pd. Pd provides a range of built-in filtering objects: [lop~], [hip~], [bp~] (and its sample-rate-controllable twin [vcf~]), the more general [biquad~], and the elemental [rpole~], [cpole~] and [rzero~], [czero~] (see the filters section of the Floss Pd manual, and the Subtractive Synthesis chapter of Johannes Kreidler's Pd Tutorial).

The patch demo_filters.pd provides a framework to play with these filter types. It uses playsound~.pd to allow you to play (and loop) a WAV file, select4~.pd to select one of 4 audio streams via "radio buttons", and plotpowspec~.pd (slightly improved from last week) to plot the running FFT spectrum, as well as the [output~] built-in from Pd-extended. You can download all these patches in prac02.zip.

Try listening to the various sound inputs provided by the top [select4~] through the different filters provided by the bottom [select4~]. Try changing the cutoff frequency with the slider (as well as the Q of the bpf with the number box); listen to the results and look at the short-time spectrum. You can try loading the speech and guitar samples into the [playsound~] unit to see how filtering affects "real" sounds (click the button at the bottom right of [playsound~] to bring up the file selection dialog; click the button on the left to play the sound, and check the toggle to make it loop indefinitely). Here are a few further experiments you could try:
To get credit for this week's practical, you must show working solutions to one of options 4, 5, 6, or 7 above.

Weds 2013-01-30: Plucked string

This week's practical looks at the Karplus-Strong plucked string simulation in Pure Data (Pd). The general pattern of these weekly practical sessions is to give you a piece of code to start with, then ask you to investigate some aspects, by using and changing the code. However, the areas to investigate are left somewhat open, in the hope that we'll each discover different things -- that we can then share.

We start with demo_karpluck.pd, my wrapper around Loomer's karpluck~.pd. In addition to the keybd.pd patch used to provide MIDI-like controls from the computer keyboard, this one also uses grapher~.pd to provide an oscilloscope-like time-domain waveform plot, and plotpowspec~.pd (based on the original by Ron Weiss) to provide a smoothed Fourier transform view. You can download all these patches in prac01.zip, then open the demo_karpluck.pd patch to run the demo. This patch provides three main controls for the sound of the plucked string:
To get going, try playing with these three parameters to see what kinds of "real" sounds you can approximate. Can you put "physical" interpretations on these parameters? Here are some things to try:
Credit: To be credited with completing this practical, you'll need to show one of the teaching staff a working instance of one of these suggestions.

This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.
Dan Ellis <dpwe@ee.columbia.edu>
Last updated: Wed May 01 11:27:39 AM EDT 2013