

Dongqing Zhang, Rajendran Kumar Rajendran, Shih-Fu Chang. General and Domain-Specific Techniques for Detecting and Recognizing Superimposed Text in Video. In IEEE International Conference on Image Processing (ICIP), Rochester, New York, September 2002.


Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


Abstract

We have developed generic and domain-specific algorithms for extracting and recognizing caption text in digital video. Our system includes several unique features. For caption box localization, we combine compressed-domain features derived from DCT coefficients and motion vectors, and we exploit long-term temporal consistency to improve localization performance. For character segmentation, we use a single-pass, threshold-free approach that combines classification and projection to address noisy segmentation, text intensity variation, and algorithm complexity. For recognition, we use Zernike moments to improve accuracy. Finally, we exploit domain knowledge: a statistical transition graph model enhances recognition of domain-specific characters, such as the ball count and game score in baseball videos. The algorithms run in real time and significantly improve recognition accuracy. Furthermore, although the experiments were conducted on baseball videos only, these algorithms (except the transition model) are general and can be used in other applications, such as news and films.
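The idea of a transition model for domain-specific characters can be illustrated with a small sketch. The abstract does not describe the actual graph or its probabilities, so the code below is only an assumption-laden stand-in: it treats the (balls, strikes) count as a hidden state, uses hypothetical transition and emission scores, and applies standard Viterbi decoding to smooth noisy per-frame readings. The names `STATES`, `transition_prob`, `emission_prob`, and `decode_counts` are invented for this illustration.

```python
import math

# Hypothetical state space: (balls, strikes) during an at-bat.
STATES = [(b, s) for b in range(4) for s in range(3)]


def transition_prob(prev, cur):
    """Illustrative, unnormalized scores for moving between counts.

    These values are assumptions, not figures from the paper.
    """
    (pb, ps) = prev
    if cur == prev:
        return 0.90   # the count usually persists between readings
    if cur == (pb + 1, ps) or cur == (pb, ps + 1):
        return 0.04   # one more ball or one more strike
    if cur == (0, 0):
        return 0.02   # a new batter resets the count
    return 1e-4       # any other jump is treated as very unlikely


def emission_prob(state, observed, p_correct=0.8):
    """Assumed recognizer model: each digit is read correctly with
    probability p_correct, otherwise uniformly confused with another digit."""
    prob = 1.0
    for true_digit, seen_digit, n_values in zip(state, observed, (4, 3)):
        if true_digit == seen_digit:
            prob *= p_correct
        else:
            prob *= (1.0 - p_correct) / (n_values - 1)
    return prob


def decode_counts(observations):
    """Viterbi decoding: most likely count sequence given noisy readings."""
    if not observations:
        return []
    # Uniform prior over the initial state, so only the emission matters.
    score = {s: math.log(emission_prob(s, observations[0])) for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(
                STATES,
                key=lambda p: score[p] + math.log(transition_prob(p, cur)),
            )
            new_score[cur] = (
                score[best_prev]
                + math.log(transition_prob(best_prev, cur))
                + math.log(emission_prob(cur, obs))
            )
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=score.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # The fourth reading (3, 0) simulates a recognition error.
    readings = [(0, 0), (0, 0), (1, 0), (3, 0), (1, 0), (1, 1), (1, 1)]
    print(decode_counts(readings))
```

In this toy setting, a reading that would require an implausible jump in the count is overridden by its neighbors, which is the general way a transition model can correct isolated recognition errors.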


Contact

Dongqing Zhang
Shih-Fu Chang

BibTex Reference

   Author = {Zhang, Dongqing and Kumar Rajendran, Rajendran and Chang, Shih-Fu},
   Title = {General and Domain-Specific Techniques for Detecting and Recognizing Superimposed Text in Video},
   BookTitle = {IEEE International Conference on Image Processing (ICIP)},
   Address = {Rochester, New York},
   Month = {September},
   Year = {2002}


