Masami Mizutani, Shahram Ebadollahi, Shih-Fu Chang. Commercial Detection in Heterogeneous Video Streams Using Fused Multi-Modal and Temporal Features. Columbia University ADVENT Technical Report #204-2004-4, September 2004.

We provide an integrated approach for detecting commercial segments in video streams. This approach systematically fuses the "local" multi-modal characteristics of commercials in the context of their "global" temporal behavior throughout the video stream. Discriminative classifiers are employed to distinguish between commercial and program segments based on their local multi-modal features. The decisions made by the different discriminators are fused using a Support Vector Machine. The fusion results are then used as the probabilistic outcomes of a generative model describing the transitions between commercial and program segments, with explicit models for the inter-arrival times of the commercial segments throughout the video. This approach aims to improve upon the simple yet effective blank-frame indicators, which usually mark the start of commercials. It also provides acceptable performance when such indicators do not exist in the program stream. We report the results of comprehensive experiments on a heterogeneous data set of 36 hours of video taken from 6 different sources. Our method achieves almost 92% correct detection of commercial segments, an 8% improvement over using the blank indicators alone. When blank indicators do not exist, our approach achieves almost 85% correct detection.
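The temporal stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a plain two-state first-order Markov chain (which implies geometric segment durations) in place of the report's explicit inter-arrival time models, and it takes per-segment commercial probabilities as given, standing in for the fused classifier output. The function name `viterbi_labels` and the parameters `p_stay` and `p_start_commercial` are hypothetical and chosen only for illustration.

```python
import math

STATES = ("program", "commercial")


def viterbi_labels(p_commercial, p_stay=0.9, p_start_commercial=0.5):
    """Return the most likely program/commercial label for each segment.

    p_commercial: per-segment probabilities that a segment is a commercial
        (assumed here to come from some fused classifier score).
    p_stay: assumed probability of remaining in the same state between
        consecutive segments (hypothetical value, not from the report).
    p_start_commercial: assumed prior that the stream starts in a commercial.
    """
    if not p_commercial:
        return []

    eps = 1e-9  # keeps logarithms finite for probabilities of exactly 0 or 1

    def log_emit(state, p):
        p = min(max(p, eps), 1.0 - eps)
        return math.log(p if state == "commercial" else 1.0 - p)

    log_trans = {
        (a, b): math.log(p_stay if a == b else 1.0 - p_stay)
        for a in STATES
        for b in STATES
    }

    # Initialise with the prior and the first segment's score.
    score = {
        "program": math.log(1.0 - p_start_commercial)
        + log_emit("program", p_commercial[0]),
        "commercial": math.log(p_start_commercial)
        + log_emit("commercial", p_commercial[0]),
    }
    back = []  # back[i][s] = best predecessor of state s at segment i + 1

    for p in p_commercial[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: score[r] + log_trans[(r, s)])
            new_score[s] = score[prev] + log_trans[(prev, s)] + log_emit(s, p)
            pointers[s] = prev
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With these assumed parameters, an isolated weak score is smoothed away while a sustained run of high scores is kept as one commercial block, which is the kind of effect the "global" temporal model is meant to provide on top of the "local" decisions.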



