Lexing Xie, Peng Xu, Shih-Fu Chang, Ajay Divakaran, Huifang Sun
In this paper, we present statistical
techniques for parsing the structure of produced soccer programs. The problem
is important for applications such as personalized video streaming and browsing
systems, in which videos are segmented into different states and important states
are selected based on user preferences. While prior work focuses on the detection
of special events such as goals or corner kicks, this paper is concerned with
generic structural elements of the game. We define two mutually exclusive states
of the game, play and break, based on the rules of soccer. Automatic
detection of such generic states is a novel and challenging problem because
the appearance and temporal dynamics of these states vary widely across
videos. We select a salient feature set from the compressed domain, dominant
color ratio and motion intensity, based on the special syntax and content characteristics
of soccer videos. We then model the stochastic structure of each state of the
game with a set of hidden Markov models. Finally, higher-level transitions are
taken into account and dynamic programming techniques are used to obtain the
maximum likelihood segmentation of the video sequence. The system achieves a
promising classification accuracy of 83.5%, with lightweight feature
extraction and model inference, as well as satisfactory accuracy in
boundary timing.
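
The final segmentation step can be illustrated with a minimal sketch. It is a generic Viterbi-style dynamic program, not the exact formulation used in this paper: it assumes that each short window of video has already been scored against the play and break models, and that transition and prior probabilities are available. The function name `segment`, the argument layout, and all numeric values below are hypothetical and chosen only for illustration.

```python
import math

PLAY, BREAK = 0, 1  # hypothetical state indices


def segment(loglik, log_trans, log_prior):
    """Return the highest-scoring state sequence via dynamic programming.

    loglik[t][s]    : log-likelihood of window t under state s
                      (assumed to come from a bank of per-state models)
    log_trans[r][s] : log-probability of moving from state r to state s
    log_prior[s]    : log-probability of starting in state s
    """
    n_states = len(log_prior)
    # Best score of any path ending in each state at the first window.
    score = [log_prior[s] + loglik[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at window t
    for t in range(1, len(loglik)):
        new_score, ptr = [], []
        for s in range(n_states):
            best_r = max(range(n_states),
                         key=lambda r: score[r] + log_trans[r][s])
            new_score.append(score[best_r] + log_trans[best_r][s] + loglik[t][s])
            ptr.append(best_r)
        score = new_score
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Illustrative inputs only: six windows, the third weakly favouring BREAK.
stay, switch = math.log(0.9), math.log(0.1)
log_trans = [[stay, switch], [switch, stay]]
log_prior = [math.log(0.5), math.log(0.5)]
loglik = [[-1.0, -3.0], [-1.0, -3.0], [-2.0, -1.5],
          [-1.0, -3.0], [-3.0, -1.0], [-3.0, -1.0]]
labels = segment(loglik, log_trans, log_prior)
```

With these made-up scores, the weakly break-like third window is absorbed into the surrounding play segment, because a brief switch costs more in transition penalties than it gains in likelihood. This is the sense in which higher-level transitions smooth the per-window classification.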