Jump to : Download | Abstract | Contact | BibTex reference | EndNote reference |


Jun Wang, Sanjiv Kumar, Shih-Fu Chang. Semi-Supervised Hashing for Large Scale Search. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2012.

Download [help]

Download paper: Adobe portable document (pdf)

Copyright notice:This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


Hashing based approximate nearest neighbor (ANN) search in huge databases has become popular owing to its computational and memory efficiency. The popular hashing methods, e.g., Locality Sensitive Hashing and Spectral Hashing, construct hash functions based on random or principal projections. The resulting hashes are either not very accurate or inefficient. Moreover these methods are designed for a given metric similarity. On the contrary, semantic similarity is usually given in terms of pairwise labels of samples. There exist supervised hashing methods that can handle such semantic similarity but they are prone to overfitting when labeled data is small or noisy. In this work, we propose a semi-supervised hashing (SSH) framework that minimizes empirical error over the labeled set and an information theoretic regularizer over both labeled and unlabeled set. Based on this framework, we present three different semi-supervised hashing methods, including orthogonal hashing, non-orthogonal hashing, and sequential hashing. Particularly, the sequential hashing method generates robust codes in which each hash function is designed to correct the errors made by the previous ones. We further show that the sequential learning paradigm can be extended to unsupervised domains where no labeled pairs are available. Extensive experiments on four large datasets (up to 80 million samples) demonstrate the superior performance of the proposed SSH methods over state-of-the-art supervised and unsupervised hashing techniques


Jun Wang
Shih-Fu Chang

BibTex Reference

   Author = {Wang, Jun and Kumar, Sanjiv and Chang, Shih-Fu},
   Title = {Semi-Supervised Hashing for Large Scale Search},
   Journal = {Pattern Analysis and Machine Intelligence, IEEE Transactions on},
   Year = {2012}

EndNote Reference [help]

Get EndNote Reference (.ref)


For problems or questions regarding this web site contact The Web Master.

This document was translated automatically from BibTEX by bib2html (Copyright 2003 © Eric Marchand, INRIA, Vista Project).