Lab 4 --- Large Vocabulary Decoding: A Love Story
Prev		Next

5. Part 4: Add Support for Beam and Rank Pruning to a Large-Vocabulary Decoder

In the first task for this part, you need to implement beam pruning. For each frame, the variable threshLogProb should be set to b - logProbBeam, where b is the highest logprob of a cell in the last frame (i.e., lastFrame). However, it is unacceptable to just loop through all cells in lastFrame right before setting threshLogProb, as we are concerned about efficiency and it is possible to solve this task without adding any new loops.

Once you have solved this, don't forget to recompile DcdLab4. To run this program on a small sample test set, run
lab4p4a.sh
In this part, we use the same models as before, but we use slightly bigger decoding graphs than in the last part. (Again, for speed's sake, we do not use the full big static decoding graph.) Instead of using a graph that contains just the reference word sequence (more or less), we made a graph that contains all 245 utterances in our full test set. (Thus, the decoding task is to decide which of the 245 test utterances the current utterance is, rather than true unconstrained word decoding.) The target output can be found in the file p4a.out in ~stanchen/e6884/lab4/. You should try to come close to matching the number of active states reported. The instructions in lab4.txt will ask you to run the script lab4p4b.sh, which does the same thing as lab4p4a.sh except on a different test set.

In the second task for this part, you need to implement rank pruning. Look in the code to see where the rank pruning code should be placed. Note that we provide an example of how to loop through all active cells. (For rank pruning, unlike in beam pruning, it is unclear there is a more efficient way of doing this other than to at least loop through all cells.) In rank pruning, we want to keep the maxRank cells with the highest logprobs. However, it is OK if you keep slightly more than maxRank cells (e.g., if you do a bucket sort). For this part, set the variable rankThreshLogProb to the logprob value such that there are (about) maxRank cells in the last frame with at least this logprob. Notice that the rank pruning code will only be called if there are indeed at least maxRank active cells in the last frame.

Once you have solved this, you again need to recompile DcdLab4. To run this program on the same test set as before, run
lab4p4c.sh
Sample output can be found in the file p4c.out in ~stanchen/e6884/lab4/. The maximum number of active states should be in the ballpark of 1000, which is what maxRank is set to in this script, though it is OK if it is slightly larger. You need not match the target output exactly in terms of active states, as your implementation may be completely different from our sample implementation. To give a ballpark figure for how fast a pretty efficient implementation is, our sample implementation runs lab4p4c.sh in about 0.42 “times real time” (xRT); i.e., on average, it takes 0.42*t seconds to process an utterance that is t seconds long. The program DcdLab4 outputs the real time factor at the end of each run. (It uses CPU time to compute this rather than elapsed time, so it should be fine if someone else is concurrently running a job on the same machine as you.) You don't want your implementation to be too slow, or Part 5 of the lab will involve an excessive amount of waiting.

Prev	Home	Next
Part 3: Add Support for Word Traceback and Skip Arcs to a Large-Vocabulary Decoder		Part 5: Evaluate the Performance of Various Models on WSJ Test Data