In this optional exercise, you have the opportunity to improve on
your MFCC front end. To get started, type the commands:
mkdir -p ~/e6884/lab1ec/
cd ~/e6884/lab1ec/
cp ~stanchen/e6884/lab1/.mk_chain .
Copy over your
Lab1_FE.C from Part 1, and also your
Lab1_DTW.H from Part 2 if you have one you would like to use.
If you want to add parameters to your front end
and have figured out how to do this, you can also grab
a copy of
Lab1_FE.H from
~stanchen/e6884/lab1/.
For this exercise, you will need to compile DcdDTW in the current directory.
We have provided two development sets for optimizing your
front end: a small test set of ~100 utterances and a large
test set of ~1000 utterances.
To run the
DcdDTW executable in the current directory on these
test sets, run
lab1p4small.sh
lab1p4large.sh
for the small and large test sets, respectively. These scripts
are set up to use the full MFCC pipeline (windowing + FFT + mel binning
with log compression + DCT), and you can change what signal processing is
done by modifying the modules you developed in Part 1 of the lab. If the
algorithms you would like to implement cannot be realized within this
framework (e.g., you don't want to do an FFT), please contact one of the
professors and we can tell you how to proceed.
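As a concrete reference for the last stage of that pipeline, here is a
sketch of the DCT-II that maps log mel-filterbank energies to cepstral
coefficients. The function name and dimensions are illustrative only; your
Lab1_FE.C module will have its own interface:

```cpp
#include <cmath>
#include <vector>

// Illustrative sketch of the final MFCC stage: a DCT-II over log mel
// energies. (Not the actual Lab1_FE.C interface.)
std::vector<double> dctII(const std::vector<double>& logMel, int numCep) {
    const double kPi = 3.14159265358979323846;
    const int numBins = static_cast<int>(logMel.size());
    std::vector<double> cep(numCep, 0.0);
    for (int k = 0; k < numCep; ++k) {
        double sum = 0.0;
        for (int n = 0; n < numBins; ++n)
            sum += logMel[n] * std::cos(kPi * k * (n + 0.5) / numBins);
        cep[k] = sum;
    }
    return cep;
}
```

A quick sanity check: a constant filterbank vector should put all of its
energy in the 0th coefficient, with the higher coefficients near zero.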
The task is set up to be speaker-independent: the speaker
used to provide the templates for a test set may have no
relation to the speaker of that test set.
The evaluation test set we will use to determine which front end
wins the “Best Front End” award will not be released until after this
assignment is due, to prevent the temptation of developing techniques
that may only work well on the development test sets. This is
consistent with the evaluation paradigm used in government-sponsored
speech recognition competitions, the primary ones being the
NIST Spoken Language Technology Evaluations.