
5.13 Copying and Coding using HCOPY

HCOPY is a general-purpose tool for copying and manipulating speech files. The general form of invocation is

    HCopy src tgt

which will make a new copy called tgt of the file called src. HCOPY can also concatenate several sources together as in

    HCopy src1 + src2 + src3 tgt

which concatenates the contents of src1, src2 and src3, storing the results in the file tgt. As well as putting speech files together, HCOPY can also take them apart. For example,

    HCopy -b 100 -e -100 src tgt

will extract samples 100 through to N-100 of the file src to the file tgt where N is the total number of samples in the source file. The range of samples to be copied can also be specified with reference to a label file, and modifications made to the speech file can be tracked in a copy of the label file. All of the various options provided by HCOPY are given in the reference section and in total they provide a powerful facility for manipulating speech data files.
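As a small illustration of the range arithmetic described above, the following shell sketch resolves a negative end position against the file length N. The function name `resolve_range` is hypothetical and not part of HTK; it only mirrors the "100 through to N-100" description and does not call HCopy.

```shell
# resolve_range N B E   (hypothetical helper, not an HTK tool)
# Print the start and end positions of the extracted range, treating a
# negative end position E as an offset back from the total length N.
resolve_range() {
    n=$1
    b=$2
    e=$3
    if [ "$e" -lt 0 ]; then
        # A negative end is measured from the end of the file: N + E.
        e=$((n + e))
    fi
    printf '%s %s\n' "$b" "$e"
}
```

For a source file of 16000 samples, `resolve_range 16000 100 -100` gives the range 100 to 15900.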

However, the use of HCOPY extends beyond copying, chopping and concatenating files. HCOPY reads in all files using the speech input/output subsystem described in the preceding sections. Hence, by specifying an appropriate configuration file, HCOPY is also a speech coding tool. For example, if the configuration file config was set up to convert waveform data to MFCC coefficients, the command

    HCopy -C config -b 100 -e -100 src.wav tgt.mfc

would parameterise the waveform file src.wav, excluding the first and last 100 samples, and store the result in tgt.mfc.
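A configuration file of the kind assumed above might be written out as in the following sketch. The parameter names are standard HTK coding parameters, but the particular values (a 10 ms frame rate, a 25 ms window, 12 cepstra plus energy) are assumptions chosen for illustration and are not taken from this section; the function name `write_mfcc_config` is likewise hypothetical.

```shell
# write_mfcc_config PATH   (hypothetical helper, not an HTK tool)
# Write an illustrative waveform-to-MFCC configuration file to PATH.
write_mfcc_config() {
    cat > "$1" <<'EOF'
# Illustrative coding parameters; the values are assumptions.
# Times are in units of 100 ns, so 100000.0 is 10 ms.
SOURCEKIND = WAVEFORM
TARGETKIND = MFCC_E
TARGETRATE = 100000.0
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF  = 0.97
NUMCHANS   = 26
NUMCEPS    = 12
EOF
}
```

After running `write_mfcc_config config`, the resulting file could be passed to HCOPY with the -C option as in the command shown earlier.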

HCOPY processes its arguments in pairs and, as with all HTK tools, argument lists can be written in a script file specified via the -S option. When coding a large database, invoking HCOPY separately for each file would incur a very large overhead. Hence, it is better to create a file, flist say, listing each source file together with its target file, for example,

    src1.wav tgt1.mfc
    src2.wav tgt2.mfc
    src3.wav tgt3.mfc
    src4.wav tgt4.mfc
    etc

and then invoke HCOPY by

    HCopy -C config -b 100 -e -100 -S flist

which would encode each file listed in flist in a single invocation.
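For a large database the script file would normally be generated rather than typed. The sketch below is one way to do this in the shell, assuming that all source files sit in a single directory with a .wav extension, that the targets should take the same base name with a .mfc extension, and that no path contains whitespace. The function name `make_flist` and the directory layout are illustrative only.

```shell
# make_flist SRC_DIR TGT_DIR   (hypothetical helper, not an HTK tool)
# Print one "source target" pair per line for every .wav file in SRC_DIR.
make_flist() {
    src_dir=$1
    tgt_dir=$2
    for wav in "$src_dir"/*.wav; do
        # Skip the unexpanded pattern when SRC_DIR holds no .wav files.
        [ -e "$wav" ] || continue
        base=$(basename "$wav" .wav)
        printf '%s %s\n' "$wav" "$tgt_dir/$base.mfc"
    done
}
```

Running `make_flist wav mfc > flist` would then produce a file in the format shown above, ready to be passed to HCOPY via -S.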

Normally HCOPY makes a direct copy of the target speech data in the output file. However, if the configuration parameter SAVECOMPRESSED is set true then the output is saved in compressed form and if the configuration parameter SAVEWITHCRC is set true then a checksum is appended to the output (see section 5.7). If the configuration parameter SAVEASVQ is set true then only VQ indices are saved and the kind of the target file is changed to DISCRETE. For this to work, the target kind must have the qualifier _V attached (see section 5.11).
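As a sketch of how the first two of these options might be switched on, the following shell function appends them to an existing configuration file. The function name `write_save_options` is hypothetical, and T is used as the boolean true value; SAVEASVQ is left out because it additionally depends on the target kind carrying the _V qualifier.

```shell
# write_save_options PATH   (hypothetical helper, not an HTK tool)
# Append the output-format options to the configuration file at PATH.
write_save_options() {
    cat >> "$1" <<'EOF'
SAVECOMPRESSED = T
SAVEWITHCRC    = T
EOF
}
```

Calling `write_save_options config` leaves any parameters already in config untouched and adds the two settings at the end.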


ECRL HTK_V2.1