SAcC_train_v1.0
This package shows how to train SAcC on a new dataset. Using genPitchLabel, a pseudo ground-truth f0 can be generated from the outputs of multiple pitch trackers (e.g. YIN, Wu, and SWIPE').
You can download a zip file containing demo_SAcC_train.m and the other files used in this demo from SAcC_train_v1.0.zip.
Note that the dataset files are not included. Please edit the script files to match your own dataset before running this demo.
----------------------------------------
OBJECTIVE:
Procedure for training the Subband Autocorrelation Classification (SAcC) pitch tracker on a new speech corpus [1].
PROCEDURE:
a. Generate the labels using genPitchLabel with YIN, Wu, and SWIPE' (the audio files are read from list files). Save the labels in pfile format.
b. Calculate subband PCA features and save them in a pfile for MLP training.
c. Train the MLP from the pfile using QuickNet.
REFERENCE:
[1] Byung Suk Lee, Noise Robust Pitch Tracking by Subband Autocorrelation Classification (SAcC), PhD Thesis, Columbia University, 2012
----------------------------------------
USAGE:
This package includes an example script:
demo_SAcC_train.m
This example trains SAcC on a Babel dataset. Because of its large volume, the dataset is not included. To run this script, the following are required.
Software packages:
audioread, YIN, Wu, SWIPEP, QuickNet, genPitchLabel (Please update the paths of the software.)
Training data:
lists/Babel_tr.list (The files named in the list must be available; otherwise, update the list as needed.)
Subband PCA file:
../../dat/keele/rbf_pinknoise/out/PCA_sr8000_bpo6_nchs30.mat (Please update the path.)
To train on a new dataset, replace the list with the new data files.
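As a rough sketch of preparing such a list for a new corpus: the snippet below assumes the list file is plain text with one audio path per line, which this README does not spell out, so check it against an existing list such as lists/Babel_tr.list before relying on it. The directory new_corpus/wav and the file name lists/new_tr.list are hypothetical.

```matlab
% Sketch: write one audio path per line into a list file.
% ASSUMPTION: list files are plain text with one path per line.
wav_dir   = 'new_corpus/wav';       % hypothetical location of the new audio
list_file = 'lists/new_tr.list';    % hypothetical list file name

% Create the output directory if it does not exist yet
list_dir = fileparts(list_file);
if ~isempty(list_dir) && ~exist(list_dir,'dir'), mkdir(list_dir); end

d = dir(fullfile(wav_dir,'*.wav')); % .wav files directly under wav_dir
fid = fopen(list_file,'w');
if fid < 0, error('Cannot open %s for writing', list_file); end
for i = 1:numel(d)
    fprintf(fid,'%s\n',fullfile(wav_dir,d(i).name));
end
fclose(fid);
fprintf('Wrote %d entries to %s\n',numel(d),list_file);
```

The resulting file name would then replace 'lists/Babel_tr.list' in tr_list_file (and similarly for the test list).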
conf_fn = 'conf/config.trBabel_sr8k_bpo6_sb24_k10';
P = load_config(conf_fn);

maxyth = 0.5900;
maxsth = 0.1200;
dispflag = 0;

out_base = 'logs/';
tst_base = 'logs/tst/';
if ~exist(out_base,'dir'), mkdir(out_base); end;
if ~exist(tst_base,'dir'), mkdir(tst_base); end;

reject_target = 1;
nMLP = P.npcf + 1; % +1 for no-pitch
cexp = P.qe;
allR = [];
tstR = [];

name_base = 'Babel_';
tr_list_file = 'lists/Babel_tr.list';
tst_list_file = 'lists/Babel_tst.list';
tr_dat_pfile = 'tmp/Babel_tr_sr8k_bpo6_sb24_k10.pfile';
tst_dat_pfile = 'tmp/Babel_tst_sr8k_bpo6_sb24_k10.pfile';

% a. Generate pseudo pitch label using genPitchLabel with Wu, YIN, SWIPE'
cal_agreed_label(tr_list_file,P.pcf,maxyth,maxsth,dispflag);
cal_agreed_label(tst_list_file,P.pcf,maxyth,maxsth,dispflag);

% b. Calculate Subband Autocorrelation PCA and save in pfiles
[tr_l_pfile] = preprocessing_sbac(tr_dat_pfile,tr_list_file,P.sr,P.mapping,P.b2,P.a2,P.t2);
[tst_l_pfile] = preprocessing_sbac(tst_dat_pfile,tst_list_file,P.sr,P.mapping,P.b2,P.a2,P.t2);

% c. Train MLP using QuickNet
[Rd,Ud,Fd,Ld,Td,hdr_sized] = pfinfo(tr_dat_pfile);
trRange = ['0:',num2str(round(P.trR*(Ud+1)))];
[d_path,d_name,d_ext] = fileparts(tr_dat_pfile);
n_fn = [d_path,'/',d_name,'.norms'];

% generate norm file
if ~exist(n_fn,'file')
    qnnrm = 'qnnorm ';
    qo1 = [' norm_ftrfile=',tr_dat_pfile];
    qo2 = [' output_normfile=',n_fn];
    cmd = [qnnrm,qo1,qo2];
    disp(cmd);
    system(cmd);
end

% output locations for weights, logs, and checkpoints
TRname = [name_base,'h',num2str(P.hids)];
w_fn = [out_base,'wgt/',TRname,'.wgt'];
if ~exist([out_base,'wgt/'],'dir'), mkdir([out_base,'wgt/']); end;
w_log = [out_base,'log/',TRname,'.log'];
if ~exist([out_base,'log/'],'dir'), mkdir([out_base,'log/']); end;
w_chk = [out_base,'chk/',TRname,'.chk'];
if ~exist([out_base,'chk/'],'dir'), mkdir([out_base,'chk/']); end;
w_chklog = [out_base,'chk/',TRname,'.log'];

% train only if the weight file does not exist yet;
% tr_l_pfile is the pfile returned by preprocessing_sbac in step b
if ~exist(w_fn,'file')
    qnstrn_wrap(tr_dat_pfile,n_fn,tr_l_pfile,w_fn,w_log,w_chklog,w_chk,Fd,Ud,allR,trRange,P.hids,P.wl,nMLP,cexp,[],[],reject_target);
end
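To make the string handling in step c concrete, the following worked example rebuilds the training range and the qnnorm command without calling pfinfo or QuickNet. The value Ud = 99 is a hypothetical stand-in for what pfinfo would return, and trR = 0.6 is taken from the printed configuration; the command is only displayed, not executed.

```matlab
% Worked example of the strings built in step c (nothing is executed).
trR = 0.6;   % training fraction, as in the printed configuration
Ud  = 99;    % HYPOTHETICAL value standing in for the pfinfo output

% Range string: '0:' followed by round(trR*(Ud+1))
trRange = ['0:',num2str(round(trR*(Ud+1)))];
disp(trRange);   % 0:60

% Norm file name: same folder and base name as the data pfile, '.norms' extension
tr_dat_pfile = 'tmp/Babel_tr_sr8k_bpo6_sb24_k10.pfile';
[d_path,d_name] = fileparts(tr_dat_pfile);
n_fn = [d_path,'/',d_name,'.norms'];

% Command line assembled the same way as in the demo script
cmd = ['qnnorm ',' norm_ftrfile=',tr_dat_pfile,' output_normfile=',n_fn];
disp(cmd);
```

With these values the range covers 60% of Ud+1 = 100, and the norm file lands next to the data pfile as tmp/Babel_tr_sr8k_bpo6_sb24_k10.norms.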
alg=SAcC_fast
nchs=24
sr=8000
fmin=100
bpo=6
q=8
n=2
ftype=2
dataset=Babel
ntype=radio_channel
tst_dataset=Babel
tst_ntype=radio_channel
feature_names=sbpca_features
pca_file=../../dat/keele/rbf_pinknoise/out/PCA_sr8000_bpo6_nchs30.mat
kdim=10
QN_Learn_Rate_Param=0
trR=0.6
hids=100
n_s=10
train_src_list=lists/sub-qtr-rats-src.list
test_src_list=lists/tst-qtr-rats-src.list
train_file=tmp/tr_rats_sr8k_bpo6_sb24_k10.pfile
train_label=tmp/sub-qtr-rats-lab.pfile
test_file=tmp/tst_rats_sr8k_bpo6_sb24_k10.pfile
test_label=tmp/tst-sub-qtr-rats-lab.pfile
Using Slaney-Patterson filterbank, frq=100..1600, bpo=6
...
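The printed configuration is a series of key=value pairs. As a minimal sketch of how such pairs could be read into a struct, the snippet below parses a short excerpt of them. It is not the package's load_config, whose implementation is not shown here; it is an illustrative stand-in, and the excerpt in txt is a hand-picked subset of the printed keys.

```matlab
% Sketch: parse whitespace-separated key=value pairs into a struct.
% This is NOT the package's load_config; it is an illustrative stand-in.
txt = 'alg=SAcC_fast nchs=24 sr=8000 fmin=100 bpo=6 trR=0.6 hids=100';

cfg = struct();
tok = regexp(txt,'(\w+)=(\S+)','tokens');  % each token is {key, value}
for i = 1:numel(tok)
    key = tok{i}{1};
    val = tok{i}{2};
    num = str2double(val);                 % NaN when val is not numeric
    if isnan(num)
        cfg.(key) = val;                   % keep strings such as 'SAcC_fast'
    else
        cfg.(key) = num;                   % store numbers as doubles
    end
end
disp(cfg);
```

Numeric fields can then be used directly, for example cfg.sr for the sampling rate, while non-numeric values stay as character arrays.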
----------------------------------------
CONTACT:
Byung Suk Lee, [email protected]
Dan Ellis, [email protected]
LabROSA, Columbia University, 2012-09-13