Graph-Based Semi-Supervised Learning and Applications




Graph-based semi-supervised learning (GSSL) is a promising paradigm for modeling the manifold structures that often exist in massive high-dimensional data. It has been shown to be effective at propagating a small number of initial labels to a large amount of unlabeled data, which matches the needs of many emerging applications such as image annotation and information retrieval. We have developed a family of techniques to address open problems such as unbalanced labels, noisy labels, and graph construction over gigantic datasets. We have applied these techniques to real-world applications including interactive image retrieval, noisy Web image reranking, biomolecular cellular image mining, and brain-machine interfaces for image retrieval. We have also combined graph-based learning with hashing techniques to derive graph-based hash codes for scalable similarity retrieval.
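To make the idea of propagating a few labels over a graph concrete, here is a minimal sketch of a generic label propagation iteration. It is a textbook-style illustration, not the specific method of any paper listed on this page. The function name `propagate_labels`, the damping parameter `alpha`, and the toy graph are all assumptions made for the example.

```python
import math

def propagate_labels(W, Y, alpha=0.9, iters=200):
    """Iterate F <- alpha * S F + (1 - alpha) * Y, where S = D^-1/2 W D^-1/2.

    W: symmetric affinity matrix (list of lists), Y: one-hot initial labels
    (all-zero rows for unlabeled nodes). Returns a predicted class per node.
    """
    n, c = len(W), len(Y[0])
    deg = [sum(row) for row in W]
    # Symmetrically normalized affinities; isolated nodes get all-zero rows.
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
          for j in range(n)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(S[i][k] * F[k][j] for k in range(n)) + (1 - alpha) * Y[i][j]
              for j in range(c)] for i in range(n)]
    # Each node takes the class with the largest propagated score.
    return [max(range(c), key=lambda j: F[i][j]) for i in range(n)]

# Toy graph (hypothetical): two triangles joined by one weak edge (2-3).
W = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
# Only node 0 (class 0) and node 5 (class 1) are labeled.
Y = [[1, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 1]]
labels = propagate_labels(W, Y)
print(labels)
```

In this toy case each unlabeled node ends up with the label of the seed in its own triangle, because the weak bridge edge carries little label mass across clusters.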



  1. Xiao-Ming Wu, Zhenguo Li, Shih-Fu Chang. Analyzing the Harmonic Structure in Graph-Based Learning. In Proceedings of Annual Conference on Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA, 2013. [pdf]
  2. Xiao-Ming Wu, Zhenguo Li, Anthony Man-Cho So, John Wright, Shih-Fu Chang. Learning with Partially Absorbing Random Walks. In Proceedings of Annual Conference on Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA, 2012. [pdf]
  3. Wei Liu, Jun Wang, Shih-Fu Chang. Robust and Scalable Graph-Based Semisupervised Learning. Proceedings of the IEEE, 2012. [pdf]
  4. Jun Wang, Tony Jebara, Shih-Fu Chang. Graph Transduction via Alternating Minimization. In International Conference on Machine Learning (ICML), Helsinki, Finland, July 2008. [pdf]
    We proposed a bivariate alternating optimization technique to derive the optimal prediction function over graphs. It treats both the initial labels and the prediction function as optimization variables, and was shown to be effective under unbalanced and noisy label conditions.
  5. Tony Jebara, Jun Wang, Shih-Fu Chang. Graph Construction and b-Matching for Semi-Supervised Learning. In International Conference on Machine Learning (ICML), Montreal, Canada, June 2009. [pdf]
    This paper shows that graphs constructed by b-matching perform better for semi-supervised learning than graphs constructed by kNN.
  6. Jun Wang, Yu-Gang Jiang, Shih-Fu Chang. Label Diagnosis through Self Tuning for Web Image Search. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Miami Beach, Florida, USA, June 2009. [pdf]
    We demonstrated promising results in filtering and denoising the incorrectly labeled images from the Web by graph-based diagnosis and tuning.
  7. Wei Liu, Junfeng He, Shih-Fu Chang. Large Graph Construction for Scalable Semi-Supervised Learning. In the 27th International Conference on Machine Learning (ICML), Haifa, Israel, June 2010. [pdf][code]
    We proposed the Anchor Graph, a highly efficient method for constructing sparse, low-rank graphs and performing semi-supervised learning over gigantic datasets with only linear complexity.
  8. Wei Liu, Jun Wang, Sanjiv Kumar, Shih-Fu Chang. Hashing with Graphs. In International Conference on Machine Learning (ICML), Bellevue, WA, USA, 2011. [pdf]
    We applied the Anchor Graph method to derive eigenfunctions over large graphs without relying on oversimplified assumptions about data distributions. We designed graph-based hashing for large-scale similarity retrieval and demonstrated retrieval accuracy even better than that of L2 linear scan.
  9. Jun Wang, Shih-Fu Chang, Xiaobo Zhou, Stephen T. C. Wong. Active Microscopic Cellular Image Annotation by Superposable Graph Transduction with Imbalanced Labels. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, Alaska, USA, June 2008. [pdf]
    In this paper, we developed a real-time system for interactive cellular image annotation using graph-based SSL. Specifically, we applied graph superposition and normalization to achieve real-time speed and to address the class imbalance issue.
  10. Yu-Gang Jiang, Jun Wang, Shih-Fu Chang, Chong-Wah Ngo. Domain Adaptive Semantic Diffusion for Large Scale Context-Based Video Annotation. In International Conference on Computer Vision (ICCV), Kyoto, Japan, September 2009. [pdf]
    We applied graph-based diffusion to fuse the results of individual concept detectors and improve the overall accuracy of video annotation.
  11. Jun Wang, Eric Pohlmeyer, Barbara Hanna, Yu-Gang Jiang, Paul Sajda, Shih-Fu Chang. Brain State Decoding for Rapid Image Retrieval. In Proceedings of the ACM International Conference on Multimedia (ACM MM), October 2009. [pdf]
    We combined an EEG-based brain signal decoder with graph-based semi-supervised learning to detect arbitrary user-initiated search targets and retrieve relevant images from a large database through a brain-machine interface.
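As a rough illustration of the anchor-based graph construction described in entry 7, the sketch below links each data point to its few nearest anchors and then works with the low-rank affinity W = Z diag(Z^T 1)^-1 Z^T without ever forming the n x n matrix. It is a simplified sketch under stated assumptions, not the released code: the Gaussian-kernel weights, the `bandwidth` parameter, the hand-picked anchors, and the function names are all hypothetical choices for the example.

```python
import math

def anchor_weights(points, anchors, s=2, bandwidth=1.0):
    """Return Z as sparse rows {anchor_index: weight}; each row sums to 1."""
    Z = []
    for x in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(x, u)) for u in anchors]
        nearest = sorted(range(len(anchors)), key=lambda k: d2[k])[:s]
        d_min = d2[nearest[0]]  # shift for numerical stability; cancels on normalization
        k = {j: math.exp(-(d2[j] - d_min) / (2 * bandwidth ** 2)) for j in nearest}
        total = sum(k.values())
        Z.append({j: w / total for j, w in k.items()})
    return Z

def anchor_graph_matvec(Z, v, m):
    """Compute W @ v for W = Z diag(Z^T 1)^-1 Z^T in time linear in len(Z)."""
    col_mass = [0.0] * m   # diagonal of Lambda = diag(Z^T 1)
    proj = [0.0] * m       # Z^T v
    for row, vi in zip(Z, v):
        for j, w in row.items():
            col_mass[j] += w
            proj[j] += w * vi
    scaled = [p / c if c > 0 else 0.0 for p, c in zip(proj, col_mass)]
    return [sum(w * scaled[j] for j, w in row.items()) for row in Z]

# Toy data (hypothetical): four 2-D points and three hand-picked anchors.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.1)]
anchors = [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
Z = anchor_weights(points, anchors, s=2)
row_sums = anchor_graph_matvec(Z, [1.0] * len(points), m=len(anchors))
print(row_sums)
```

Because each row of Z holds only s nonzero weights, a product with W costs on the order of n * s operations, which is where the linear scaling in the number of data points comes from. Multiplying by the all-ones vector also checks that every row of the implied W sums to one.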