

Dongqing Zhang, Shih-Fu Chang. A Bayesian Framework for Fusing Multiple Word Knowledge Models in Videotext Recognition. In IEEE Computer Vision and Pattern Recognition (CVPR), Madison, Wisconsin, June 2003.

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


Videotext recognition is challenging due to low resolution, diverse fonts and styles, and cluttered backgrounds. Previous methods improved recognition through multi-frame averaging, image interpolation, and lexicon correction, but recognition using multi-modality language models has not been explored. In this paper, we present a formal Bayesian framework for videotext recognition that combines multiple knowledge sources using mixture models, and we describe a learning approach based on Expectation-Maximization (EM). To handle unseen words, a back-off smoothing approach derived from the Bayesian model is also presented. We explore a prototype that fuses a model derived from closed captions with a model derived from the British National Corpus. The closed-caption model is based on a unique distribution model of the time distance between videotext words and closed-caption words. Our method achieves a significant performance gain, with a word recognition rate of 76.8% and a character recognition rate of 86.7%. The proposed methods also reduce false videotext detection significantly, with a false alarm rate of 8.2% and no substantial loss of recall.
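As a rough illustration of the mixture idea mentioned in the abstract, the sketch below fits the weights of a word-prior mixture P(w) = sum_k lambda_k * P_k(w) with EM while the component models stay fixed. It is not the paper's formulation: the function name `em_mixture_weights`, the two-model setup, and the toy probabilities are all assumptions made for this example, and the OCR likelihood and back-off smoothing described in the abstract are left out.

```python
def em_mixture_weights(component_probs, n_iter=50):
    """Estimate mixture weights for fixed component word models with EM.

    component_probs[i][k] is P_k(w_i): the probability that (hypothetical)
    model k assigns to the i-th observed word.
    """
    n_models = len(component_probs[0])
    weights = [1.0 / n_models] * n_models  # start from uniform weights

    for _ in range(n_iter):
        totals = [0.0] * n_models
        for probs in component_probs:
            # E-step: responsibility of each model for this word.
            joint = [w * p for w, p in zip(weights, probs)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # word unseen by every model; skipped in this sketch
            for k in range(n_models):
                totals[k] += joint[k] / norm
        total = sum(totals)
        if total == 0.0:
            break  # no word was explained by any model
        # M-step: new weight is the average responsibility.
        weights = [t / total for t in totals]
    return weights


# Toy data (made up): column 0 plays the role of a closed-caption model,
# column 1 the role of a general-corpus model.
toy_probs = [[0.2, 0.01], [0.3, 0.02], [0.1, 0.05], [0.01, 0.02]]
weights = em_mixture_weights(toy_probs)
print(weights)
```

On this toy data the first model explains most words better, so EM shifts weight toward it. A real system would estimate the weights on held-out data and combine the resulting prior with the recognizer's character-level evidence.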


Dongqing Zhang
Shih-Fu Chang

BibTeX Reference

   Author = {Zhang, Dongqing and Chang, Shih-Fu},
   Title = {A Bayesian Framework for Fusing Multiple Word Knowledge Models in Videotext Recognition},
   BookTitle = {IEEE Computer Vision and Pattern Recognition (CVPR)},
   Address = {Madison, Wisconsin},
   Month = {June},
   Year = {2003}

