Summary
The goal of this project is to summarize sports video using score text
detection and recognition. We investigated two properties of the caption
text in sports video: the stationary position of the caption box and the
temporal transition rules of the game-stat characters. The first property
is used to improve caption box detection by filtering out false alarms.
The latter property is modeled as a transition graph, which significantly
improves recognition performance.
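To illustrate how temporal transition rules can correct per-frame recognition errors, here is a minimal sketch that decodes a ball count with Viterbi search over a small transition graph. The state set, the allowed transitions (stay, increase by one, or reset to zero), and the `floor` likelihood are assumptions made for this example; the actual graph used in the project is not described on this page.

```python
import math

STATES = [0, 1, 2, 3]  # assumed state set: possible ball counts


def allowed(prev, cur):
    """Assumed rule: the count stays, increases by one, or resets to 0."""
    return cur == prev or cur == prev + 1 or cur == 0


def decode(frame_scores, floor=1e-6):
    """Return the most likely count sequence that obeys the transition graph.

    frame_scores: one dict per frame, mapping state -> recognizer likelihood.
    """
    best = {s: math.log(frame_scores[0].get(s, floor)) for s in STATES}
    back = []
    for obs in frame_scores[1:]:
        new_best, pointers = {}, {}
        for cur in STATES:
            # Best predecessor among the states allowed to precede `cur`.
            score, prev = max((best[p], p) for p in STATES if allowed(p, cur))
            new_best[cur] = score + math.log(obs.get(cur, floor))
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Trace back from the best final state.
    state = max(best, key=best.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# The third frame is misread as "3" by the per-frame recognizer.
frames = [
    {0: 0.90, 1: 0.05, 2: 0.03, 3: 0.02},
    {0: 0.10, 1: 0.80, 2: 0.05, 3: 0.05},
    {0: 0.05, 1: 0.40, 2: 0.05, 3: 0.50},
    {0: 0.03, 1: 0.02, 2: 0.90, 3: 0.05},
]
print(decode(frames))  # [0, 1, 1, 2]
```

Taking the highest-scoring character in each frame independently would give 0, 1, 3, 2, which violates the assumed rules; decoding over the graph replaces the implausible "3" with "1".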
The detection algorithms use features extracted from the compressed domain
and achieve real-time speed while maintaining high accuracy: 98% precision
and 99% recall. Recognition reaches 92% accuracy using word dictionary
enhancement and the temporal transition graph model.
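One common way to realize dictionary enhancement is to snap each raw recognizer output to the closest entry in a small domain vocabulary. The sketch below does this with Levenshtein edit distance. The vocabulary, the `max_distance` threshold, and the function names are hypothetical; the page does not say how the project's dictionary step works.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def correct_word(raw, dictionary, max_distance=2):
    """Return the nearest dictionary word, or raw if nothing is close enough."""
    candidate = raw.upper()
    best = min(dictionary, key=lambda word: edit_distance(candidate, word))
    return best if edit_distance(candidate, best) <= max_distance else raw


# Hypothetical caption vocabulary for a baseball broadcast.
VOCABULARY = ["BALL", "STRIKE", "OUT"]

print(correct_word("8ALL", VOCABULARY))    # BALL
print(correct_word("STR1KE", VOCABULARY))  # STRIKE
```

A distance threshold keeps unrelated strings, such as team names missing from the vocabulary, from being forced onto a dictionary word.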
We applied the above character recognition system to a baseball video
summarization system. The text box location is determined automatically
by an initialization process that checks the spatio-temporal consistency
of candidate locations. Domain knowledge is then applied to categorize
the words detected in the text box (e.g., strike-ball count, score, runner
position). Finally, events are detected and annotated using text change
detection and recognition. Currently, the system detects two types of
events in baseball video: Pitch and Score. With these, the system can
provide random access to the start and end of each pitch or score in long
sports video programs.
A snapshot of the sports highlight summarization system is shown below.
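As a rough illustration of event detection from caption text changes, the sketch below compares consecutive recognized caption states and emits Pitch and Score events. The `CaptionState` fields and the change rules are assumptions made for this example, not the project's actual logic; for instance, a pitch that leaves the count unchanged would be missed by this simple rule.

```python
from typing import NamedTuple


class CaptionState(NamedTuple):
    """Hypothetical recognized caption content at one sampled time."""
    time: float   # seconds from the start of the program
    balls: int
    strikes: int
    away: int     # away-team score
    home: int     # home-team score


def detect_events(states):
    """Emit (time, label) pairs wherever the caption text changes."""
    events = []
    for prev, cur in zip(states, states[1:]):
        if (cur.away, cur.home) != (prev.away, prev.home):
            events.append((cur.time, "Score"))
        if (cur.balls, cur.strikes) != (prev.balls, prev.strikes):
            events.append((cur.time, "Pitch"))
    return events


timeline = [
    CaptionState(0.0, 0, 0, 0, 0),
    CaptionState(5.0, 1, 0, 0, 0),   # count changes: a pitch was thrown
    CaptionState(10.0, 1, 0, 0, 0),  # no caption change
    CaptionState(15.0, 0, 0, 1, 0),  # score changes and the count resets
]
print(detect_events(timeline))
# [(5.0, 'Pitch'), (15.0, 'Score'), (15.0, 'Pitch')]
```

The event times can then serve as index points for random access into the program.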
People
Dongqing Zhang, Professor Shih-Fu Chang
Demo
http://www.ee.columbia.edu/~dqzhang/SportsSummarize.html
Publication
D. Zhang, R. Kumar Rajendran, and S.-F. Chang, General and Domain-Specific
Techniques for Detecting and Recognizing Superimposed Text in Video,
International Conference on Image Processing (ICIP-2002), Rochester,
New York, USA, Sep 22-25, 2002. (PS.GZ/PDF)
D. Zhang and S.-F. Chang, Event Detection in Baseball Video Using
Superimposed Caption Recognition, Proceedings of ACM Multimedia,
Juan-les-Pins, France, December 1-6, 2002. (PS.GZ/PDF)
Links
DVMM
Project: Statistical Fusion of Knowledge Sources and Transcript Streams
for Improved Text Recognition
Last updated: June 12, 2002.