

Xi Zhou, Xiaodan Zhuang, Shuicheng Yan, Shih-Fu Chang, Mark Hasegawa-Johnson, Thomas S. Huang. SIFT-Bag kernel for video event analysis. In MM '08: Proceeding of the 16th ACM international conference on Multimedia, October 2008.

Download

Download paper: Adobe portable document (pdf)

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


Abstract

In this work, we present a SIFT-Bag based generative-to-discriminative framework for addressing the problem of video event recognition in unconstrained news videos. In the generative stage, each video clip is encoded as a bag of SIFT feature vectors, the distribution of which is described by a Gaussian Mixture Model (GMM). In the discriminative stage, the SIFT-Bag Kernel is designed to characterize the Kullback-Leibler divergence between the specialized GMMs of any two video clips, and this kernel is then used for supervised learning in two ways. On the one hand, the kernel's discriminating power is further refined for centroid-based video event classification by using the Within-Class Covariance Normalization approach, which suppresses the kernel components with high variability across video clips of the same event. On the other hand, the SIFT-Bag Kernel is used in a Support Vector Machine for margin-based video event classification. Finally, the outputs of these two classifiers are fused for the final decision. Experiments on the TRECVID 2005 corpus demonstrate that the mean average precision is boosted from the best reported 38.2% in [36] to 60.4% with our new framework.
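The abstract does not state the kernel's formula. As a rough illustration only, the sketch below assumes one common construction for kernels built on the Kullback-Leibler divergence between GMMs: every clip's GMM shares the same mixture weights and diagonal covariances and differs only in its component means. Under that assumption, a component-wise upper bound on the divergence reduces to half a squared Euclidean distance between scaled, stacked mean vectors, so a plain dot product of those vectors acts as a kernel. The function names, the toy dimensions, and the shared-parameter assumption are all hypothetical and are not taken from the paper.

```python
import numpy as np

def supervector(means, weights, variances):
    """Stack the component means of one clip's GMM into a single vector.

    means:     (K, D) per-clip component means
    weights:   (K,)   mixture weights shared by all clips (assumption)
    variances: (K, D) diagonal covariances shared by all clips (assumption)
    """
    scaled = np.sqrt(weights)[:, None] * means / np.sqrt(variances)
    return scaled.ravel()

def kl_upper_bound(means_a, means_b, weights, variances):
    """Component-wise upper bound on KL(GMM_a || GMM_b) when both GMMs
    share the same weights and diagonal covariances."""
    diff = means_a - means_b
    return 0.5 * np.sum(weights[:, None] * diff ** 2 / variances)

def supervector_kernel(means_a, means_b, weights, variances):
    """Linear kernel between the supervectors of two clips."""
    phi_a = supervector(means_a, weights, variances)
    phi_b = supervector(means_b, weights, variances)
    return float(phi_a @ phi_b)

# Toy data: K = 3 components, D = 4 dimensions (SIFT descriptors have 128).
rng = np.random.default_rng(0)
K, D = 3, 4
weights = np.array([0.5, 0.3, 0.2])
variances = rng.uniform(0.5, 2.0, size=(K, D))
means_a = rng.normal(size=(K, D))
means_b = rng.normal(size=(K, D))

phi_a = supervector(means_a, weights, variances)
phi_b = supervector(means_b, weights, variances)
bound = kl_upper_bound(means_a, means_b, weights, variances)

# Half the squared distance between supervectors equals the bound.
half_sq_dist = 0.5 * np.sum((phi_a - phi_b) ** 2)
```

Because the bound is a squared distance in the supervector space, the resulting Gram matrix can be passed to any kernel classifier that accepts precomputed kernels, which is one plausible reading of how such a kernel would feed the two classifiers the abstract mentions.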


Contact

Shih-Fu Chang

BibTex Reference

   Author = {Zhou, Xi and Zhuang, Xiaodan and Yan, Shuicheng and Chang, Shih-Fu and Hasegawa-Johnson, Mark and Huang, Thomas S.},
   Title = {SIFT-Bag kernel for video event analysis},
   BookTitle = {MM '08: Proceeding of the 16th ACM international conference on Multimedia},
   Month = {October},
   Year = {2008}


