5. Part 3: Familiarize yourself with the programming conventions used in this course

To familiarize you with what the programming exercises will be like, we will go through a mini-exercise. So that you only have to write the interesting bits of code in a speech recognizer for the labs, we have written extensive amounts of “glue” code. Each program will be compiled from a large number of C++ files, but almost all of these files have already been written for you. We will just leave out parts of a file or two for you to fill in.

To see what this is like, let's get started on the mini-exercise. First, create a new subdirectory for us to work in and go there:
mkdir -p ~/e6884/lab0/
cd ~/e6884/lab0/
Next, let's copy over a couple of files that we will need:
cp ~stanchen/e6884/lab0/Lab0_FE.C .
cp ~stanchen/e6884/lab0/.mk_chain .
The file Lab0_FE.C is the C++ source file that you will be editing for the exercise. The file .mk_chain is needed for compilation; it records where all the other source files that will be compiled in are located.

Now, open the file Lab0_FE.C in a text editor. You might notice that there are a bunch of weird constructs in the file that you don't understand. Don't freak out yet; there will be plenty of time for this later. Look for the markers BEGIN_LAB and END_LAB near the end of the file. This is the only section of the file you need to understand. The rest of the file can be ignored, though you may want to read the comments there for your own edification. (If you are interested in exploring the related header files and source code, look in the following directories:
~stanchen/pub/zeeapi/inc/
~stanchen/pub/zeeapi/src/
~stanchen/pub/zeelib/inc/
~stanchen/pub/zeelib/src/
~stanchen/pub/e6884/inc/
~stanchen/pub/e6884/src/
For example, Lab0_FE.H is located in ~stanchen/pub/e6884/inc/.)

In this exercise, you will be writing a simple signal processing module that takes as input a 2-D array containing a vector of floating-point numbers (or features) for each time unit (or frame) in a speech signal, and outputs a scaled version of the array. Since this is Lab 0, we are going to tell you what the answer is. Type/paste in the following code between the BEGIN_LAB and END_LAB markers:
for (int frm = 0; frm < inFrames; ++frm)
    {
    for (int dim = 0; dim < inDim; ++dim)
        outBuf[frm][dim] = inBuf[frm][dim] * scaleFactorM;
    }
Notice that the variables outBuf and inBuf behave like 2-D arrays in C. In reality, they are C++ objects, but you don't need to worry about this. Also notice that these arrays have already been sized correctly; we will do this for you whenever possible to make your life easier. The scaling constant scaleFactorM has also been mysteriously initialized for you. In fact, this parameter can be set on the command line of the programs that this file will be compiled into, but again, how this happens does not concern you at this time. Anyway, we are now done with the programming portion of this exercise.
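If you would like to try the same loop outside the course framework, here is a minimal standalone sketch. It is not the lab code: the type FeatureMatrix and the function scaleFeatures are hypothetical stand-ins built on std::vector, and the sketch has to size the output array itself, which the real framework does for you.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the course's matrix objects:
// one row per frame, one column per feature dimension.
using FeatureMatrix = std::vector<std::vector<float>>;

// Returns a copy of inBuf with every value multiplied by scaleFactor.
// Unlike in Lab0_FE.C, the output array is sized here by hand.
FeatureMatrix scaleFeatures(const FeatureMatrix& inBuf, float scaleFactor) {
    const std::size_t inFrames = inBuf.size();
    const std::size_t inDim = inFrames > 0 ? inBuf[0].size() : 0;
    FeatureMatrix outBuf(inFrames, std::vector<float>(inDim));
    for (std::size_t frm = 0; frm < inFrames; ++frm) {
        // Every frame is expected to have the same number of features.
        assert(inBuf[frm].size() == inDim);
        for (std::size_t dim = 0; dim < inDim; ++dim)
            outBuf[frm][dim] = inBuf[frm][dim] * scaleFactor;
    }
    return outBuf;
}
```

Calling scaleFeatures on a matrix with 3 frames of 2 features each and a scale factor of 2 should give back a 3-by-2 matrix with every entry doubled.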

In terms of the bigger picture, we can view signal processing in ASR as being composed of a number of processing steps applied in sequence. Each processing module takes the matrix of values produced by the previous module (consisting of feature values for each frame in an utterance) and generates a matrix of values to be fed to the next module. The above example implements a module that does simple scaling; in Lab 1, you'll be implementing a number of modules needed in producing MFCC features.
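To make the idea of chaining modules concrete, here is a rough sketch of such a pipeline. Everything in it is an assumption made for illustration: the names FeatureMatrix, Module, makeScaler, makeOffset, and runPipeline do not come from the course libraries, and the real framework may connect its modules quite differently.

```cpp
#include <functional>
#include <vector>

// Hypothetical types: a matrix of feature values (one row per frame) and
// a processing module that maps one such matrix to another.
using FeatureMatrix = std::vector<std::vector<float>>;
using Module = std::function<FeatureMatrix(const FeatureMatrix&)>;

// Hypothetical module: multiplies every value by factor.
Module makeScaler(float factor) {
    return [factor](const FeatureMatrix& in) {
        FeatureMatrix out = in;
        for (auto& frame : out)
            for (float& value : frame)
                value *= factor;
        return out;
    };
}

// Hypothetical module: adds offset to every value.
Module makeOffset(float offset) {
    return [offset](const FeatureMatrix& in) {
        FeatureMatrix out = in;
        for (auto& frame : out)
            for (float& value : frame)
                value += offset;
        return out;
    };
}

// Applies the modules in order; each one consumes the matrix produced
// by the module before it.
FeatureMatrix runPipeline(const std::vector<Module>& stages,
                          FeatureMatrix features) {
    for (const Module& stage : stages)
        features = stage(features);
    return features;
}
```

For instance, running a scaler with factor 2 followed by an offset of 1 turns the value 4 into 9. Swapping the order of the two modules would give 10 instead, which is why the order of the processing steps matters.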