Efficient Alternative to Bag-of-words for Visual Recognition
Abstract
We present an efficient alternative to the traditional vocabulary-based
bag-of-visual-words (BoW) representation used for visual classification tasks.
Our representation is both conceptually and computationally superior to
bag-of-visual-words: (1) we iteratively generate a maximum likelihood
estimate of an image given a set of characteristic features, in contrast to
BoW methods, where an image is represented as a histogram of visual words;
(2) we randomly sample a set of characteristic features instead of employing
the computationally intensive clustering algorithms used during the vocabulary
generation step of BoW methods. Our performance is comparable to the
state of the art in experiments on three challenging human action datasets
and an equally challenging scene categorization dataset, which demonstrates
the broad applicability of our method.
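The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the `bandwidth` value, the EM-style weight update, and the function names `sample_characteristic_features` and `ml_weights` are all assumptions chosen for illustration. The sketch draws characteristic features uniformly at random from a pool of local descriptors (no clustering), then iteratively fits mixture weights that maximize the likelihood of one image's descriptors; the resulting weight vector stands in for the BoW histogram.

```python
import numpy as np

def sample_characteristic_features(pool, m, rng):
    """Pick m descriptors uniformly at random, without replacement.

    This replaces the clustering step of BoW vocabulary generation.
    """
    idx = rng.choice(pool.shape[0], size=m, replace=False)
    return pool[idx]

def ml_weights(image_features, centers, bandwidth=1.0, n_iter=50):
    """Iteratively estimate mixture weights for one image (assumed EM update).

    Assumed model: p(x) = sum_j alpha_j * K(x, c_j) with a Gaussian kernel K
    centred on each sampled characteristic feature c_j.
    """
    # Squared Euclidean distances between descriptors and centers: (n, m).
    diff = image_features[:, None, :] - centers[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    # Log-kernel, shifted per row for numerical stability. A per-row
    # constant cancels in the responsibilities computed below.
    log_k = -d2 / (2.0 * bandwidth ** 2)
    log_k -= log_k.max(axis=1, keepdims=True)
    kernel = np.exp(log_k)

    m = centers.shape[0]
    alpha = np.full(m, 1.0 / m)  # start from uniform weights
    for _ in range(n_iter):
        # E-step: responsibility of each center for each descriptor.
        resp = kernel * alpha
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: new weights are the average responsibilities.
        alpha = resp.mean(axis=0)
    return alpha

# Synthetic stand-ins for real local descriptors (hypothetical data).
rng = np.random.default_rng(0)
pool = rng.normal(size=(200, 8))    # descriptors pooled over a training set
centers = sample_characteristic_features(pool, 16, rng)
image = rng.normal(size=(50, 8))    # descriptors extracted from one image
alpha = ml_weights(image, centers)  # fixed-length image representation
```

The vector `alpha` has one entry per sampled feature and sums to one, so it can be passed to a standard classifier in the same way as a normalized BoW histogram.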
Method Summary and Results
that lack a dominant subject, such as land/seascapes, we crop or expand the
Code/Data
A subset of the data, along with ground-truth annotations, is available here.
Relevant Publications
Subhabrata Bhattacharya, Rahul Sukthankar, Rong Jin, Mubarak Shah, "A
Probabilistic Representation for Efficient Large Scale Visual Recognition
Tasks", In Proc. of IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), Colorado Springs, USA, pp. 2593-2600, 2011.