Detection & Recognition of Captioned Text in Sports Video


Summary

The motivation of this project is to summarize sports video using score text detection and recognition. We investigated two properties of captioned text in sports video: the stationary property of the caption box and the temporal transition rules of the game-stat characters. The first property is used to improve caption box detection by filtering out false alarms. The latter property is modeled as a transition graph, which significantly improves recognition performance.
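To illustrate how temporal transition rules can correct per-frame recognition errors, the sketch below decodes a single count field with a Viterbi-style search over a small transition graph. The rule set (a count may stay, increase by one, or reset to 0), the state set, and the function names are assumptions made for illustration; they are not taken from the project's actual model.

```python
import math

def allowed(prev, cur):
    # Hypothetical rule for a count field: stay, increase by one, or reset to 0.
    return cur == prev or cur == prev + 1 or cur == 0

def viterbi_decode(frame_scores, states=(0, 1, 2)):
    """Find the most likely state sequence that obeys the transition rules.

    frame_scores: list of dicts mapping state -> per-frame recognizer probability.
    """
    floor = 1e-9  # avoids log(0) for states the recognizer did not score
    best = {s: math.log(frame_scores[0].get(s, floor)) for s in states}
    back = []
    for scores in frame_scores[1:]:
        new_best, pointers = {}, {}
        for cur in states:
            # Only predecessors permitted by the transition graph are considered.
            prev_score, prev_state = max(
                (best[p], p) for p in states if allowed(p, cur)
            )
            new_best[cur] = prev_score + math.log(scores.get(cur, floor))
            pointers[cur] = prev_state
        best = new_best
        back.append(pointers)
    # Trace back from the best final state.
    state = max(best, key=best.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

frames = [
    {0: 0.90, 1: 0.05, 2: 0.05},
    {0: 0.40, 1: 0.10, 2: 0.50},  # noisy frame: the per-frame best guess is 2
    {0: 0.10, 1: 0.80, 2: 0.10},
    {0: 0.10, 1: 0.80, 2: 0.10},
    {0: 0.05, 1: 0.15, 2: 0.80},
]
print(viterbi_decode(frames))  # [0, 0, 1, 1, 2]
```

In this toy input the second frame would be read as 2 if taken on its own, but a jump from 0 to 2 is not permitted by the assumed graph, so the decoded sequence keeps 0 there.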

The detection algorithms use features extracted from the compressed domain and achieve real-time speed while maintaining high accuracy: 98% precision and 99% recall. Recognition reaches 92% accuracy by using word dictionary enhancement and the temporal transition graph model.
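One simple way to picture word dictionary enhancement is to snap each raw recognized word to its closest entry in a small caption vocabulary. The sketch below does this with Python's standard `difflib.get_close_matches`; the vocabulary, the similarity cutoff, and the function name `correct_word` are assumptions for illustration and may differ from the method used in the project.

```python
import difflib

# Hypothetical caption vocabulary; the project's actual dictionary is not given.
CAPTION_WORDS = ["BALL", "STRIKE", "OUT", "SCORE"]

def correct_word(raw, vocabulary=CAPTION_WORDS, cutoff=0.6):
    """Return the closest vocabulary word, or the raw text if nothing is close."""
    matches = difflib.get_close_matches(raw.upper(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else raw

print(correct_word("STR1KE"))  # STRIKE
print(correct_word("8ALL"))    # BALL
```

Words with no sufficiently similar dictionary entry are returned unchanged, so the correction step does not force unrelated text into the vocabulary.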

We applied the above character recognition system to a baseball video summarization system. The text box location is determined automatically by an initialization process that checks the spatio-temporal consistency of candidate locations. Domain knowledge is then applied to categorize the words detected in the text box (e.g., strike-ball count, score, runner position). Finally, events are detected and annotated using text change detection and recognition. Currently, the system detects two types of events in baseball video: Pitch and Score. With these, the system can provide random access to the start and end of each pitch or score in long sports video programs. A snapshot of the sports highlight summarization system is shown below.
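As an illustration of the event detection step (separate from the system snapshot), the sketch below turns a sequence of recognized caption states into Pitch and Score events by looking for text changes. The `CaptionState` fields, the rule that any ball-strike count change marks a pitch, and the priority given to score changes are simplifying assumptions, not the system's documented logic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CaptionState:
    """Hypothetical recognized contents of the caption box at one time point."""
    time_s: float
    balls: int
    strikes: int
    home: int
    away: int

def detect_events(states):
    """Emit (time, label) pairs wherever the recognized caption text changes."""
    events = []
    for prev, cur in zip(states, states[1:]):
        if (cur.home, cur.away) != (prev.home, prev.away):
            # Assumption: a score change takes priority over a count change.
            events.append((cur.time_s, "Score"))
        elif (cur.balls, cur.strikes) != (prev.balls, prev.strikes):
            # Assumption: any change in the ball-strike count marks a pitch.
            events.append((cur.time_s, "Pitch"))
    return events

states = [
    CaptionState(0.0, 0, 0, 0, 0),
    CaptionState(12.0, 0, 1, 0, 0),
    CaptionState(30.0, 1, 1, 0, 0),
    CaptionState(55.0, 0, 0, 1, 0),
]
print(detect_events(states))
# [(12.0, 'Pitch'), (30.0, 'Pitch'), (55.0, 'Score')]
```

The returned timestamps could serve as index points for random access into a long program.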

People

Dongqing Zhang, Professor Shih-Fu Chang

Demo

http://www.ee.columbia.edu/~dqzhang/SportsSummarize.html

Publication

D. Zhang, R. Kumar Rajendran, and S.-F. Chang, General and Domain-Specific Techniques for Detecting and Recognizing Superimposed Text in Video, International Conference on Image Processing (ICIP-2002), Rochester, New York, USA, Sep 22-25, 2002.

D. Zhang and S.-F. Chang, Event Detection in Baseball Video Using Superimposed Caption Recognition, Proceedings of ACM Multimedia, Juan-les-Pins, France, December 1-6, 2002.

Links

DVMM Project: Statistical Fusion of Knowledge Sources and Transcript Streams for Improved Text Recognition

Last updated: June 12, 2002.