

Kuan-Ting Lai, Felix X. Yu, Ming-Syan Chen, Shih-Fu Chang. Video Event Detection by Inferring Temporal Instance Labels. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), oral, Columbus, OH, June 2014.

Download

Download paper: Adobe portable document (pdf)

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


Video event detection allows intelligent indexing of video content based on events. Traditional approaches extract features from video frames or shots, then quantize and pool the features to form a single vector representation for the entire video. Though simple and efficient, the final pooling step may lead to loss of temporally local information, which is important in indicating which part in a long video signifies presence of the event. In this work, we propose a novel instance-based video event detection approach. We represent each video as multiple instances, defined as video segments of different temporal intervals. The objective is to learn an instance-level event detection model based on only video-level labels. To solve this problem, we propose a large-margin formulation which treats the instance labels as hidden latent variables, and simultaneously infers the instance labels as well as the instance-level classification model. Our framework infers optimal solutions that assume positive videos have a large number of positive instances while negative videos have the fewest ones. Extensive experiments on large-scale video event datasets demonstrate significant performance gains. The proposed method is also useful in explaining the detection results by localizing the temporal segments in a video that are responsible for the positive detection.
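The idea of inferring instance labels jointly with an instance-level classifier can be illustrated with a small toy sketch. This is not the formulation from the paper: it assumes a simple alternating scheme, a linear model trained with a squared hinge loss, and a fixed hypothetical parameter pos_fraction standing in for the assumption that positive videos contain many positive instances. All function names, parameters, and the toy features are invented for illustration.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def fit_squared_hinge(xs, ys, steps=200, lr=0.05, lam=0.01):
    """Batch gradient descent on mean squared hinge loss + (lam/2)*|w|^2."""
    n, d = len(xs), len(xs[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        gw, gb = [lam * wi for wi in w], 0.0
        for x, y in zip(xs, ys):
            slack = 1.0 - y * (dot(w, x) + b)
            if slack > 0:  # only margin violators contribute to the gradient
                c = -2.0 * slack * y / n
                gw = [g + c * xi for g, xi in zip(gw, x)]
                gb += c
        w = [wi - lr * g for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def infer_instance_labels(videos, video_labels, pos_fraction=0.5, rounds=3):
    """Alternate between fitting the classifier and relabeling instances.

    pos_fraction is an assumed share of positive instances in each
    positive video; negative videos keep all-negative instance labels.
    """
    # Start by giving every instance the label of its video.
    labels = [[y] * len(v) for v, y in zip(videos, video_labels)]
    w, b = None, None
    for _ in range(rounds):
        xs = [x for v in videos for x in v]
        ys = [l for ls in labels for l in ls]
        w, b = fit_squared_hinge(xs, ys)
        for vi, (v, y) in enumerate(zip(videos, video_labels)):
            if y < 0:
                continue
            k = max(1, int(pos_fraction * len(v)))
            ranked = sorted(range(len(v)),
                            key=lambda i: dot(w, v[i]) + b, reverse=True)
            top = set(ranked[:k])
            labels[vi] = [1 if i in top else -1 for i in range(len(v))]
    return w, b, labels

# Toy data: each video is a list of 2-D segment features (made up).
# Segments with a large first coordinate play the role of the event.
videos = [
    [[2.0, 0.1], [-2.0, 0.0], [2.1, -0.1], [-1.9, 0.2]],    # positive video
    [[-2.1, 0.1], [1.9, 0.0], [-2.0, -0.2], [2.2, 0.1]],    # positive video
    [[-2.0, 0.0], [-2.1, 0.1], [-1.9, -0.1], [-2.2, 0.2]],  # negative video
    [[-1.8, 0.1], [-2.0, -0.1], [-2.1, 0.0], [-1.9, 0.1]],  # negative video
]
video_labels = [1, 1, -1, -1]

w, b, labels = infer_instance_labels(videos, video_labels)

# Video-level score: the best-scoring segment, which also localizes the event.
video_scores = [max(dot(w, x) + b for x in v) for v in videos]
print(labels)
print(video_scores)
```

The per-instance labels returned for a positive video indicate which segments the model holds responsible for the detection, which mirrors the localization use described in the abstract. Taking the maximum segment score as the video score is one simple choice among several; averaging the top few segments would be another.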


Felix X. Yu
Shih-Fu Chang

BibTex Reference

   Author = {Lai, Kuan-Ting and Yu, Felix X. and Chen, Ming-Syan and Chang, Shih-Fu},
   Title = {Video Event Detection by Inferring Temporal Instance Labels},
   BookTitle = {IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), oral},
   Address = {Columbus, OH},
   Month = {June},
   Year = {2014}


