HCOPY is a general-purpose tool for copying and manipulating speech files. The general form of invocation is
    HCopy src tgt
which will make a new copy called tgt of the file called src. HCOPY can also concatenate several sources together, as in
    HCopy src1 + src2 + src3 tgt
which concatenates the contents of src1, src2 and src3, storing the result in the file tgt. As well as putting speech files together, HCOPY can also take them apart. For example,
    HCopy -b 100 -e -100 src tgt
will extract samples 100 through to N-100 of the file src to the file tgt, where N is the total number of samples in the source file. The range of samples to be copied can also be specified with reference to a label file, and modifications made to the speech file can be tracked in a copy of the label file. All of the various options provided by HCOPY are given in the reference section and in total they provide a powerful facility for manipulating speech data files.
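The range arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not of how HCOPY is implemented. The function name extraction_range is hypothetical, and the treatment of a non-negative end value as an absolute sample index is an assumption.

```python
def extraction_range(total_samples, begin, end):
    """Return (first, last) sample positions for a begin/end pair.

    Follows the rule stated in the text: a negative end value is
    counted back from N, the total number of samples in the file.
    Assumption: a non-negative end value is an absolute position.
    """
    first = begin
    last = total_samples + end if end < 0 else end
    if not 0 <= first <= last <= total_samples:
        raise ValueError("requested range lies outside the file")
    return first, last


if __name__ == "__main__":
    # One second of audio at 16 kHz, trimmed by 100 samples at each end.
    print(extraction_range(16000, 100, -100))
```

With N = 16000, the pair (100, -100) gives samples 100 through to 15900, which matches the "100 through to N-100" description.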
However, the use of HCOPY extends beyond that of copying, chopping and concatenating files. HCOPY reads in all files using the speech input/output subsystem described in the preceding sections. Hence, by specifying an appropriate configuration file, HCOPY is also a speech coding tool. For example, if the configuration file config was set up to convert waveform data to MFCC coefficients, the command
    HCopy -C config -b 100 -e -100 src.wav tgt.mfc
would parameterise the waveform file src.wav, excluding the first and last 100 samples, and store the result in tgt.mfc.
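As a rough illustration of what such a configuration file might contain, the sketch below renders a small set of coding parameters as "NAME = value" lines. The parameter names are standard HTK configuration parameters, but the particular values are typical settings chosen for illustration, not ones given in this section. The helper render_config and the dictionary MFCC_PARAMS are hypothetical names.

```python
# Illustrative settings only (assumed, not taken from this section).
# Times are expressed in units of 100 ns, as is usual in HTK.
MFCC_PARAMS = {
    "SOURCEFORMAT": "WAV",     # read Microsoft WAVE input
    "TARGETKIND": "MFCC_0",    # MFCCs with the zeroth cepstral coefficient
    "TARGETRATE": 100000.0,    # 10 ms frame period
    "WINDOWSIZE": 250000.0,    # 25 ms analysis window
    "USEHAMMING": "T",
    "PREEMCOEF": 0.97,
    "NUMCHANS": 26,            # filterbank channels
    "CEPLIFTER": 22,
    "NUMCEPS": 12,
}


def render_config(params):
    """Render a parameter dictionary as HTK-style 'NAME = value' lines."""
    return "".join(f"{name} = {value}\n" for name, value in params.items())


if __name__ == "__main__":
    # Print the text that would be saved as the file "config".
    print(render_config(MFCC_PARAMS), end="")
```

Saving the printed text in a file called config would give a file of the kind assumed by the command above.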
HCOPY will process its arguments in pairs, and as with all HTK tools, argument lists can be written in a script file specified via the -S option. When coding a large database, the separate invocation of HCOPY for each file needing to be processed would incur a very large overhead. Hence, it is better to create a file, flist say, containing a list of all source and target files, for example,
    src1.wav tgt1.mfc
    src2.wav tgt2.mfc
    src3.wav tgt3.mfc
    src4.wav tgt4.mfc
    etc
and then invoke HCOPY by
    HCopy -C config -b 100 -e -100 -S flist
which would encode each file listed in flist in a single invocation.
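For a large database the script file is usually generated rather than typed by hand. The following is a minimal sketch of one way to do this, assuming every source file maps to a target with the same base name and a .mfc extension in a single output directory. The function name script_lines and the directory names are hypothetical, and file names containing spaces are not handled.

```python
from pathlib import PurePosixPath


def script_lines(wav_paths, out_dir, suffix=".mfc"):
    """Build one 'source target' line per waveform file.

    Each target keeps the base name of its source and is placed in
    out_dir with the given suffix (an assumed naming convention).
    """
    lines = []
    for wav in wav_paths:
        target = PurePosixPath(out_dir) / (PurePosixPath(wav).stem + suffix)
        lines.append(f"{wav} {target}")
    return lines


if __name__ == "__main__":
    sources = ["data/src1.wav", "data/src2.wav", "data/src3.wav"]
    # Print the text that would be saved as the file "flist".
    print("\n".join(script_lines(sources, "mfc")))
```

Writing the printed lines to flist gives a script file of the form shown above, ready to be passed to HCOPY with -S.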
Normally HCOPY makes a direct copy of the target speech data in the output file. However, if the configuration parameter SAVECOMPRESSED is set true then the output is saved in compressed form and if the configuration parameter SAVEWITHCRC is set true then a checksum is appended to the output (see section 5.7). If the configuration parameter SAVEASVQ is set true then only VQ indices are saved and the kind of the target file is changed to DISCRETE. For this to work, the target kind must have the qualifier _V attached (see section 5.11).
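The dependency between SAVEASVQ and the _V qualifier is easy to overlook, so a configuration can be checked before a long coding run. The sketch below is a hypothetical pre-flight check, not part of HTK. It assumes the configuration has already been parsed into a dictionary and that boolean values are written as T or TRUE.

```python
def is_true(value):
    """Interpret an HTK-style boolean (assumed spellings: T or TRUE)."""
    return str(value).strip().upper() in {"T", "TRUE"}


def check_save_options(params):
    """Return a list of problems found in the output-related settings."""
    problems = []
    if is_true(params.get("SAVEASVQ", "F")):
        # Qualifiers follow the base kind, e.g. MFCC_E_V -> ["E", "V"].
        qualifiers = str(params.get("TARGETKIND", "")).split("_")[1:]
        if "V" not in qualifiers:
            problems.append("SAVEASVQ requires the _V qualifier on TARGETKIND")
    return problems


if __name__ == "__main__":
    print(check_save_options({"SAVEASVQ": "T", "TARGETKIND": "MFCC_E"}))
    print(check_save_options({"SAVEASVQ": "T", "TARGETKIND": "MFCC_E_V"}))
```

An empty list means that no conflict was found among the settings checked here. It does not validate the rest of the configuration.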