Structure Analysis of Sports Video Using Domain Models

Di Zhong and Shih-Fu Chang


In this paper, we present an effective framework for scene detection and structure analysis of sports video, using tennis and baseball as examples. Sports video is characterized by a predictable temporal syntax, recurrent events with consistent features, and a fixed number of views. Our approach combines domain-specific knowledge, supervised machine learning techniques, and automatic feature analysis at multiple levels. Real-time processing performance is achieved by using compressed-domain processing techniques. High accuracy in view recognition is achieved by using compressed-domain global features as prefilters, followed by refined object-level analysis in a later verification stage. Applications include high-level structure browsing and navigation, highlight generation, and mobile media filtering.