
dvmmPub12

Winston Hsu, Shih-Fu Chang. Generative, Discriminative, and Ensemble Learning on Multi-modal Perceptual Fusion toward News Video Story Segmentation. In IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan, June 2004.


Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Abstract

News video story segmentation is a critical task for automatic video indexing and summarization. Our prior work demonstrated promising performance using a generative model, called Maximum Entropy (ME), which models the posterior probability given the multi-modal perceptual features near the candidate points. In this paper, we investigate alternative statistical approaches based on discriminative models, i.e., Support Vector Machines (SVM), and on ensemble learning, i.e., Boosting. In addition, we develop a novel approach, called BoostME, which uses the ME classifiers and their associated confidence scores in each boosting iteration. We evaluate these methods on the TRECVID 2003 broadcast news data set. We find that the SVM-based and ME-based techniques both outperform the pure Boosting techniques, with the SVM-based solutions achieving slightly higher accuracy. Moreover, we summarize an extensive analysis of error sources across distinct news story types to identify future research opportunities.

Contact

Winston Hsu
Shih-Fu Chang

BibTeX Reference

@InProceedings{dvmmPub12,
   Author = {Hsu, Winston and Chang, Shih-Fu},
   Title = {Generative, Discriminative, and Ensemble Learning on Multi-modal Perceptual Fusion toward News Video Story Segmentation},
   BookTitle = {IEEE International Conference on Multimedia and Expo (ICME)},
   Address = {Taipei, Taiwan},
   Month = {June},
   Year = {2004}
}

