
xd:event

Dong Xu, Shih-Fu Chang. Visual Event Recognition in News Video using Kernel Methods with Multi-Level Temporal Alignment. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Minneapolis, USA, June 2007.

Download

Download paper: Adobe portable document (pdf)

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Abstract

In this work, we systematically study the problem of visual event recognition in unconstrained news video sequences. We adopt the discriminative kernel-based method for which video clip similarity plays an important role. First, we represent a video clip as a bag of orderless descriptors extracted from all of the constituent frames and apply Earth Mover's Distance (EMD) to integrate similarities among frames from two clips. Observing that a video clip is usually comprised of multiple sub-clips corresponding to event evolution over time, we further build a multi-level temporal pyramid. At each pyramid level, we integrate the information from different sub-clips with integer-value-constrained EMD to explicitly align the sub-clips. By fusing the information from the different pyramid levels, we develop Temporally Aligned Pyramid Matching (TAPM) for measuring video similarity. We conduct comprehensive experiments on the TRECVID 2005 corpus, which contains more than 6,800 clips. Our experiments demonstrate that 1) the TAPM multi-level method clearly outperforms single-level EMD, and 2) single-level EMD outperforms by a large margin (43.0% in Mean Average Precision) basic detection methods that use only a single key-frame. Extensive analysis of the results also reveals an intuitive interpretation of sub-clip alignment at different levels.
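The sub-clip alignment step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that both clips are split into the same number of sub-clips with equal weights, in which case an integer-valued flow reduces to a one-to-one matching, and that the pairwise sub-clip distances (for example, frame-level EMD values) are already computed. The function name `align_subclips` and the numbers in `cost` are hypothetical.

```python
from itertools import permutations


def align_subclips(cost):
    """Return (mean matched distance, assignment) for a square cost matrix.

    cost[i][j] is an assumed, precomputed distance between sub-clip i of
    one clip and sub-clip j of the other. assignment[i] is the index of
    the sub-clip matched to sub-clip i.
    """
    n = len(cost)
    if n == 0 or any(len(row) != n for row in cost):
        raise ValueError("cost must be a non-empty square matrix")
    best_total, best_perm = float("inf"), None
    # Brute force over all one-to-one matchings; acceptable only because
    # a pyramid level is assumed to hold a handful of sub-clips.
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_total, best_perm = total, perm
    return best_total / n, best_perm


# Hypothetical distances for a level with two sub-clips per clip, where
# the event stages occur in opposite temporal order in the two clips.
cost = [[0.9, 0.2],
        [0.1, 0.8]]
distance, assignment = align_subclips(cost)
```

Here the cheapest matching pairs the first sub-clip of one clip with the second sub-clip of the other, which is the kind of temporal re-ordering that explicit alignment is meant to absorb. For more than a few sub-clips, a polynomial-time assignment solver would replace the brute-force loop.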

Contact

Dong Xu
Shih-Fu Chang

BibTeX Reference

@InProceedings{xd:event,
   Author = {Xu, Dong and Chang, Shih-Fu},
   Title = {Visual Event Recognition in News Video using Kernel Methods with Multi-Level Temporal Alignment},
   BookTitle = {IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR)},
   Address = {Minneapolis, USA},
   Month = {June},
   Year = {2007}
}

EndNote Reference

Get EndNote Reference (.ref)



This document was translated automatically from BibTeX by bib2html (Copyright 2003 © Eric Marchand, INRIA, Vista Project).