Hashing for Large-Scale Matching and Retrieval
Summary
We are developing new hashing methods for finding nearest neighbors in very large datasets. Such techniques are needed in many important applications, such as content-based retrieval and matching of images and videos, matching of visual features in high-dimensional spaces (e.g., SIFT), and other applications involving millions or billions of samples. In several solutions, we seek the optimal projections for generating the binary hash bits. In others, we exploit strategies such as semi-supervised learning, graph-based manifold representation, query-dependent adaptation, or joint speed-accuracy optimization to significantly improve hashing performance.
Recent Papers
Semi-Supervised Hashing [1] - In this work, we develop a semi-supervised hashing method that minimizes empirical error on the labeled data while maximizing the variance and independence of hash bits over both the labeled and unlabeled data.
Sequential Projection Hashing [2] - In this paper, we develop a data-dependent projection learning method (similar in concept to boosting) in which each hash function is designed to correct the errors made by the previous one.
Optimized Kernel Hashing [3] - In this paper, we develop a new hashing algorithm that creates efficient codes for large-scale data of general formats with any kernel function, including kernels on vectors, graphs, sequences, and sets.
Query-Adaptive Hash-based Ranking [4] - One problem with hash-based ranking is the lack of ordering among images mapped to the same hash bin. In this paper, we develop an adaptive method that learns the optimal weight for each hash bit for a diverse set of predefined semantic concept classes. For a new query, adaptive weights are computed by evaluating the proximity between the query and the concept categories.
Hashing with Jointly Optimized Speed and Accuracy [5] - In this paper, we develop a new scalable hashing algorithm that jointly optimizes search accuracy and search time. Our method generates compact hash codes for data of general formats with any similarity function.
Hashing with Scalable Graphs [6] - Real-world datasets often reside on low-dimensional manifolds in high-dimensional spaces. In this paper, we use anchor graphs to represent the manifold structures in large-scale datasets. We develop graph-based hashing methods by computing the eigenvectors (and eigenfunctions) of the graph Laplacian, without assuming restrictive probability distributions. We also develop hierarchical hashing to address the rapid energy decay problem associated with typical spectral hashing approaches.
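The general pipeline behind the papers above (project a sample, threshold to binary bits, rank by Hamming distance, and optionally weight bits to order items that would otherwise tie) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method of any paper listed: random Gaussian projections stand in for the learned projections of [1]-[3], the per-bit weights are supplied by hand rather than learned as in [4], and all function names are hypothetical.

```python
import random


def make_projections(num_bits, dim, seed=0):
    """Draw random Gaussian projection vectors.

    Assumption: random projections are only a placeholder for the
    data-dependent, learned projections described in the papers.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_bits(x, projections):
    """Produce one bit per projection: 1 if w . x >= 0, else 0."""
    return [
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) >= 0 else 0
        for w in projections
    ]


def hamming(a, b):
    """Count the bit positions where two codes differ."""
    return sum(u != v for u, v in zip(a, b))


def weighted_hamming(a, b, weights):
    """Sum the weights of the bit positions where two codes differ."""
    return sum(w for u, v, w in zip(a, b, weights) if u != v)


def rank(query_code, codes, weights=None):
    """Return database indices sorted by (weighted) Hamming distance."""
    def distance(i):
        if weights is None:
            return hamming(query_code, codes[i])
        return weighted_hamming(query_code, codes[i], weights)

    return sorted(range(len(codes)), key=distance)


if __name__ == "__main__":
    projections = make_projections(num_bits=8, dim=4)
    database = [[random.Random(i).uniform(-1, 1) for _ in range(4)] for i in range(5)]
    codes = [hash_bits(x, projections) for x in database]
    query_code = hash_bits(database[0], projections)
    print(rank(query_code, codes))
```

With plain Hamming distance, items at the same distance from the query are indistinguishable and keep their database order; passing per-bit `weights` to `rank` gives those items a finer ordering, which is the kind of ambiguity that query-adaptive ranking is meant to resolve.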
People
Shih-Fu Chang, Junfeng He, Yu-Gang Jiang, Wei Liu, Jun Wang
Publications