This page stands as a proof of concept for the cascade time-frequency linear prediction (CTFLP) algorithm as it appears in the paper:
M. Athineos and D. Ellis (2003). Sound Texture Modelling with Linear Prediction in both Time and Frequency Domains. Proc. ICASSP-03, Hong Kong, April 2003 (to appear). (4pp)
It demonstrates the modeling and time-stretching capabilities of CTFLP on rough noisy textures by contrasting it with existing schemes.
The column TDLP presents the results of a standard noise-excited time domain linear prediction (TDLP) for various stretching factors.
The column CTFLP presents the results of our noise-excited cascade time-frequency linear prediction for various stretching factors.
The column PVoc presents the results of Phase Vocoder stretching (note that this method by definition is not noise-excited).
The column SR presents the results of simple stretching by resampling (note that this method by definition is also not noise-excited).
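For reference, stretching by resampling can be sketched in a few lines. This is only an illustration, not the code used to produce the SR column: the function name `stretch_by_resampling` is hypothetical and linear interpolation is an assumed choice of interpolator.

```python
import numpy as np

def stretch_by_resampling(x, factor):
    """Stretch x in time by `factor` using linear interpolation.

    Played back at the original sample rate, the output lasts `factor`
    times longer, and every frequency is lowered by the same factor.
    """
    x = np.asarray(x, dtype=float)
    n_out = int(round(len(x) * factor))
    # Positions in the original signal from which each output sample is read.
    positions = np.linspace(0.0, len(x) - 1, n_out)
    return np.interp(positions, np.arange(len(x)), x)
```

Because the waveform itself is slowed down, this method shifts the spectrum along with the duration, which is why it serves only as a baseline here.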
In order to be fair to TDLP, CTFLP and PVoc, the analysis window was kept the same across all methods and all sounds: 512 samples (23 ms at 22050 Hz) with 50% overlap. Moreover, we used 50 poles per frame for TDLP and 40 time-domain plus 10 frequency-domain poles per frame for CTFLP in all examples, so the pole rate is matched too. Note that even better resynthesis results can be achieved for CTFLP by fine-tuning the pole allocation, but that is not the purpose of this test.
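The settings above, together with a noise-excited all-pole resynthesis of a single frame, can be sketched as follows. This is a minimal illustration under assumptions, not the code behind the examples: the names `lpc` and `all_pole_filter`, the Hann window, the autocorrelation method and the small regularisation term are all choices made for this sketch.

```python
import numpy as np

SR = 22050               # sample rate of the examples (Hz)
WIN = 512                # analysis window length (samples), about 23 ms
HOP = WIN // 2           # 50% overlap
TDLP_POLES = 50          # poles per frame for TDLP
CTFLP_POLES = (40, 10)   # time-domain / frequency-domain poles for CTFLP

def lpc(frame, order):
    """Autocorrelation-method LPC. Returns a = [1, a1, ..., a_order] and a gain."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]  # lags 0..order
    r[0] *= 1.0 + 1e-9   # tiny regularisation (assumption, for numerical safety)
    idx = np.arange(order)
    R = r[np.abs(np.subtract.outer(idx, idx))]   # Toeplitz matrix R[i, j] = r[|i-j|]
    coeffs = np.linalg.solve(R, -r[1:order + 1])
    err = r[0] + np.dot(coeffs, r[1:order + 1])  # residual energy over the frame
    gain = np.sqrt(max(err, 0.0) / n)            # per-sample residual amplitude
    return np.concatenate(([1.0], coeffs)), gain

def all_pole_filter(excitation, a):
    """Filter the excitation through 1/A(z) with a direct-form recursion."""
    order = len(a) - 1
    y = np.zeros(len(excitation) + order)        # leading zeros act as initial state
    for i in range(len(excitation)):
        past = y[i:i + order][::-1]              # y[n-1], ..., y[n-order]
        y[i + order] = excitation[i] - np.dot(a[1:], past)
    return y[order:]

# One frame: fit the spectral envelope, then excite the filter with white noise.
rng = np.random.default_rng(0)
frame = rng.standard_normal(WIN) * np.hanning(WIN)
a, gain = lpc(frame, TDLP_POLES)
resynth = all_pole_filter(gain * rng.standard_normal(WIN), a)
```

Note that the noise excitation is stationary within the frame, so any temporal structure shorter than the window is lost in this kind of resynthesis.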
First, one can listen to the 1x TDLP resynthesis and observe what a conventional, spectral-envelope-based, noise-excited resynthesis scheme can achieve. Then, on listening to the 1x CTFLP, the improvement in resynthesis quality is immediately noticeable. One way to compare the two schemes is to think of CTFLP as taking 10 poles away from the spectral envelope fit and spending them on a temporal envelope instead. As a result, the roughness of the textures is preserved.
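The idea of spending poles on a temporal envelope can be illustrated by running linear prediction over the DCT coefficients of a frame rather than over its samples: by time-frequency duality, the all-pole "spectrum" of that predictor traces the energy of the frame over time. The sketch below is an illustration under assumptions, not the published implementation. The names `dct2` and `temporal_envelope` are hypothetical, the DCT-II is an assumed choice of transform, and the predictor is applied directly to the frame, whereas in a cascade it would presumably be applied to the residual of the time-domain stage.

```python
import numpy as np

def dct2(x):
    """Unnormalised DCT-II computed with an explicit cosine basis."""
    N = len(x)
    k = np.arange(N)
    basis = np.cos(np.pi * np.outer(k, np.arange(N) + 0.5) / N)  # basis[k, n]
    return basis @ x

def temporal_envelope(frame, order=10):
    """Approximate the squared temporal envelope of `frame` with `order` poles."""
    N = len(frame)
    c = dct2(np.asarray(frame, dtype=float))
    # Autocorrelation-method LPC, applied along the frequency axis.
    r = np.correlate(c, c, mode="full")[N - 1:N + order]
    r[0] *= 1.0 + 1e-9   # tiny regularisation (assumption, for numerical safety)
    idx = np.arange(order)
    R = r[np.abs(np.subtract.outer(idx, idx))]
    coeffs = np.linalg.solve(R, -r[1:order + 1])
    a = np.concatenate(([1.0], coeffs))
    err = r[0] + np.dot(coeffs, r[1:order + 1])
    # Evaluate err / |A|^2 on N angles in (0, pi); angle index n maps to time n.
    theta = np.pi * (np.arange(N) + 0.5) / N
    A = np.exp(-1j * np.outer(theta, np.arange(order + 1))) @ a
    return err / np.abs(A) ** 2

# A quiet frame with a short noise burst: the envelope should peak near the burst.
rng = np.random.default_rng(1)
frame = 0.01 * rng.standard_normal(512)
frame[300:341] += rng.standard_normal(41)
envelope = temporal_envelope(frame, order=10)
```

Shaping the noise excitation with such an envelope is what lets a transient stay localised within the 512-sample window instead of being spread across it.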
The second step is to compare the stretched versions, where the temporal resolution of the resynthesis is in a sense "magnified". It is now easy to hear the time-smearing artifacts of spectral-envelope-based techniques such as TDLP and the Phase Vocoder (PVoc). In contrast, the proposed CTFLP method preserves the intelligibility of individual microtransients for stretches of up to 8x.
(Click on the name of each example to download the original sound. The format is .wav, PCM 16-bit, 22050 Hz Mono)