Go Irie, Dong Liu, Zhenguo Li, Shih-Fu Chang. A Bayesian Approach to Multimodal Visual Dictionary Learning. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, June 2013.

Download

Download paper: Adobe portable document (pdf)

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Note on this paper

Supplementary material available here


Abstract

Despite significant progress, most existing visual dictionary learning methods rely on image descriptors alone or together with class labels. However, Web images are often associated with text data which may carry substantial information regarding image semantics, and may be exploited for visual dictionary learning. This paper explores this idea by leveraging relational information between image descriptors and textual words via co-clustering, in addition to information of image descriptors. Existing co-clustering methods are not optimal for this problem because they ignore the structure of image descriptors in the continuous space, which is crucial for capturing visual characteristics of images. We propose a novel Bayesian co-clustering model to jointly estimate the underlying distributions of the continuous image descriptors as well as the relationship between such distributions and the textual words through a unified Bayesian inference. Extensive experiments on image categorization and retrieval have validated the substantial value of the proposed joint modeling in improving visual dictionary learning, where our model shows superior performance over several recent methods.


Contact

Dong Liu
Zhenguo Li
Shih-Fu Chang

BibTex Reference

   Author = {Irie, Go and Liu, Dong and Li, Zhenguo and Chang, Shih-Fu},
   Title = {A Bayesian Approach to Multimodal Visual Dictionary Learning},
   BookTitle = {IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR)},
   Address = {Portland, OR},
   Month = {June},
   Year = {2013}

EndNote Reference

Get EndNote Reference (.ref)


