
Segmentation by Aggregating Superpixels

Project Summary

Image segmentation is commonly performed by modeling low-level features (e.g., color and edges), and a large number of methods have been proposed along this line, including Normalized Cuts [Shi and Malik'00], Minimum Spanning Tree [Felzenszwalb and Huttenlocher'04], and Mean Shift [Comaniciu and Meer'02]. Despite substantial progress, performance is still far from satisfactory. One major challenge is the fundamental difficulty of simultaneously capturing and modeling the variety of visual patterns that co-occur in a natural image and that may differ substantially from image to image.

In this project, we address this issue from a novel perspective: instead of directly modeling low-level features, we aggregate mid-level grouping cues in the form of superpixels, which can be obtained easily by over-segmenting the image with any reasonable existing segmentation method. The key observation is that superpixels generated by different algorithms with varying parameters capture multi-scale and diverse visual patterns in the image. Successfully integrating the cues from such a large collection of superpixels is a promising yet not fully explored direction. We propose a segmentation framework based on bipartite graph partitioning that aggregates multi-layer superpixels in a principled and effective manner. Computationally, it is tailored to the unbalanced bipartite graph structure and leads to a highly efficient, linear-time spectral algorithm (Transfer Cuts, or Tcut). Our method achieves significantly better performance on the Berkeley Segmentation Database than state-of-the-art techniques.

Bipartite Graph and Transfer Cuts
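The framework builds an unbalanced bipartite graph whose two sides are the image pixels and the superpixels collected from all over-segmentation layers. Below is a minimal sketch of one plausible construction, linking each pixel to every superpixel that contains it with a small constant weight; the full construction in the paper also incorporates additional cues (e.g., superpixel feature similarity), which this sketch omits, and the exact weights used there may differ. All names (bipartite_graph, alpha) are illustrative.

```python
# Minimal sketch: unbalanced bipartite graph between pixels and superpixels.
# The weight alpha and the membership-only edges are illustrative assumptions,
# not the exact construction from the paper.
import numpy as np
from scipy import sparse

def bipartite_graph(superpixel_maps, alpha=1e-3):
    """superpixel_maps: list of integer label maps (one per over-segmentation),
    each of shape (H, W) with labels 0..n_sp-1."""
    H, W = superpixel_maps[0].shape
    n_pix = H * W
    rows, cols, vals = [], [], []
    col_offset = 0
    for labels in superpixel_maps:
        flat = labels.ravel()
        n_sp = int(flat.max()) + 1
        # Each pixel is linked to the superpixel that contains it, in every layer,
        # with a small constant weight alpha.
        rows.extend(range(n_pix))
        cols.extend(col_offset + flat)
        vals.extend([alpha] * n_pix)
        col_offset += n_sp
    # B has one row per pixel and one column per superpixel (all layers stacked).
    return sparse.csr_matrix((vals, (rows, cols)), shape=(n_pix, col_offset))
```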

Speedup by Tcut on Bipartite Graph Partitioning
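Because the graph is highly unbalanced (hundreds of thousands of pixels versus a few thousand superpixels), the partitioning eigenproblem can be solved on the small superpixel side and the resulting eigenvectors transferred back to the pixel side, which is what makes the spectral step essentially linear in the number of pixels. The following is a minimal sketch of this transfer idea using SciPy and scikit-learn; the variable names, normalization details, and the final k-means step are illustrative assumptions rather than the authors' released code.

```python
# Minimal sketch of the Transfer-Cuts idea: solve the spectral problem on the
# small (superpixel) side of the bipartite graph, then transfer eigenvectors
# back to pixels and cluster them.
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import eigsh
from sklearn.cluster import KMeans

def transfer_cuts(B, n_segments):
    """B: sparse (n_pixels x n_superpixels) across-affinity matrix,
    with n_pixels >> n_superpixels and n_segments < n_superpixels."""
    dx = np.maximum(np.asarray(B.sum(axis=1)).ravel(), 1e-12)   # pixel-side degrees
    dy = np.maximum(np.asarray(B.sum(axis=0)).ravel(), 1e-12)   # superpixel-side degrees
    Dx_inv = sparse.diags(1.0 / dx)
    Dy_inv_sqrt = sparse.diags(1.0 / np.sqrt(dy))
    # Small m-by-m problem: W_Y = B^T D_X^{-1} B, symmetrically normalized.
    Wy = (B.T @ Dx_inv @ B).tocsc()
    nWy = Dy_inv_sqrt @ Wy @ Dy_inv_sqrt
    # Top eigenpairs of the normalized small matrix; s = (1 - lambda)^2.
    s, Z = eigsh(nWy, k=n_segments, which='LA')
    V = Dy_inv_sqrt @ Z                                          # superpixel-side eigenvectors
    U = Dx_inv @ (B @ V) / np.sqrt(np.maximum(s, 1e-12))         # transfer to pixel side
    return KMeans(n_clusters=n_segments, n_init=10).fit_predict(U)
```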

Performance

Segmentations using different combinations of superpixels

  • MS - Mean Shift [Comaniciu and Meer'02];

  • FH - Efficient Graph-Based Segmentation [Felzenszwalb and Huttenlocher'04];

  • (3, MS&FH) - using three over-segmentations from each method (see the sketch below).
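As an illustration, here is a hedged sketch of generating several FH over-segmentation layers with scikit-image; the scales, the filename, and the omission of the Mean Shift layers are assumptions made for illustration and need not match the settings used in the paper.

```python
# Minimal sketch: three FH [Felzenszwalb and Huttenlocher'04] over-segmentations
# at different scales, to be stacked as superpixel layers. Mean Shift layers
# would be generated analogously with a Mean Shift implementation.
from skimage import io, segmentation

image = io.imread('example.jpg')          # hypothetical input image
superpixel_maps = [
    segmentation.felzenszwalb(image, scale=s, sigma=0.8, min_size=20)
    for s in (100, 200, 400)              # three FH layers at increasing scale
]
```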

Benefits of Adaptive Superpixels

  • Including larger superpixels for complex images helps suppress strong edges within objects.

Quantitative Comparisons on Berkeley Segmentation Database

  • PRI: Probabilistic Rand Index (see the sketch below)
  • VOI: Variation of Information
  • GCE: Global Consistency Error
  • BDE: Boundary Displacement Error
  • BFM: Boundary-based F measure
  • RSC: Region-wise segmentation covering

  • SAS (our method) takes 6.44s per 481-by-321 image, of which 4.11s is spent generating superpixels and 0.65s on Tcut. MNcut, MLSS, Ncut, and TBES take more than 30s, 40s, 150s, and 500s per image, respectively.
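As an example of how one of these scores is computed, below is a minimal sketch of the Probabilistic Rand Index: the Rand index between the test segmentation and each human-annotated ground truth, averaged over annotators. The function name and the use of sklearn.metrics.rand_score are illustrative assumptions.

```python
# Minimal sketch of PRI: average Rand index over multiple ground-truth maps.
import numpy as np
from sklearn.metrics import rand_score

def probabilistic_rand_index(seg, ground_truths):
    """seg: (H, W) integer label map; ground_truths: list of (H, W) label maps."""
    return np.mean([rand_score(gt.ravel(), seg.ravel()) for gt in ground_truths])
```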

Perceptual Comparisons

Download

Download request form

People

Zhenguo Li, Xiao-Ming Wu, Shih-Fu Chang

Publication

Zhenguo Li, Xiao-Ming Wu, Shih-Fu Chang. Segmentation Using Superpixels: A Bipartite Graph Partitioning Approach. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012. [pdf] [Poster]