This page is under construction. Please check back later for more information, including code and future work.
We study the problem of learning with label proportions, in which the training data is provided in groups and only the proportion of each class in each group is known. This learning setting has broad applications in data privacy, political science, healthcare, marketing, and computer vision.
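To make the setting concrete, the following minimal sketch computes the per-group class proportions that a learner would be given. The toy labels, the group ids, and the function name `label_proportions` are hypothetical and serve only as an illustration; in the actual setting the instance labels are hidden and only the resulting proportions are available.

```python
from collections import Counter


def label_proportions(labels, groups):
    """Return {group: {class: fraction}} from per-instance labels and group ids."""
    counts = {}
    for label, group in zip(labels, groups):
        counts.setdefault(group, Counter())[label] += 1
    return {
        g: {c: n / sum(cnt.values()) for c, n in cnt.items()}
        for g, cnt in counts.items()
    }


# Hypothetical toy data: six instances split into two groups of three.
toy_labels = [1, 1, -1, -1, -1, 1]
toy_groups = ["a", "a", "a", "b", "b", "b"]
toy_props = label_proportions(toy_labels, toy_groups)
```

Here `toy_props["a"]` records that two thirds of group `a` is positive, which is all the supervision the learner receives for that group.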
We propose a new method called proportion-SVM, or $\propto$SVM, which explicitly models the latent unknown instance labels together with the known group label proportions in a large-margin framework. Unlike existing work, the approach avoids making restrictive assumptions about the data. The $\propto$SVM model leads to a non-convex integer programming problem. To solve it efficiently, we propose two algorithms: one based on simple alternating optimization and the other based on a convex relaxation. Extensive experiments on standard datasets show that $\propto$SVM outperforms the state of the art, especially for larger group sizes.
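The alternating optimization idea can be sketched roughly as follows. This is a simplified illustration, not the released code: it assumes binary labels in {-1, +1}, a linear model trained by plain subgradient descent as a stand-in for a proper SVM solver, and an absolute-difference penalty between the target and the assigned positive proportion of each group. All function names and the parameters `C`, `C_p`, `epochs`, `lr`, and `iters` are hypothetical.

```python
import numpy as np


def hinge(y, f):
    """Elementwise hinge loss max(0, 1 - y * f)."""
    return np.maximum(0.0, 1.0 - y * f)


def fit_linear_svm(X, y, C=1.0, epochs=200, lr=0.01):
    """Subgradient descent on 0.5 * ||w||^2 + C * sum(hinge); a stand-in solver."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1.0  # instances inside the margin
        grad_w = w - C * (y[viol][:, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def update_bag_labels(f, p, C=1.0, C_p=10.0):
    """With the model fixed, choose labels for one group given scores f
    and the target positive proportion p (assumed absolute-difference penalty)."""
    n = len(f)
    # Reduction in hinge loss from labelling instance i as +1 instead of -1.
    gain = hinge(-1.0, f) - hinge(1.0, f)
    order = np.argsort(-gain)  # largest gain first
    best_cost, best_y = np.inf, None
    for theta in range(n + 1):  # theta = number of positives in the group
        y = -np.ones(n)
        y[order[:theta]] = 1.0
        cost = C * hinge(y, f).sum() + C_p * abs(theta / n - p)
        if cost < best_cost:
            best_cost, best_y = cost, y
    return best_y


def alternating_llp(X, bags, props, C=1.0, C_p=10.0, iters=10, seed=0):
    """Alternate between fitting the model and re-labelling each group."""
    rng = np.random.default_rng(seed)
    y = rng.choice([-1.0, 1.0], size=len(X))  # random initial labels
    for _ in range(iters):
        w, b = fit_linear_svm(X, y, C)  # labels fixed, update model
        f = X @ w + b
        for idx, p in zip(bags, props):  # model fixed, update labels
            y[idx] = update_bag_labels(f[idx], p, C, C_p)
    return w, b, y
```

Here `bags` is a list of index arrays, one per group, and `props` holds the known positive proportion of each group. Because the underlying problem is non-convex, a procedure of this kind can depend on the random initial labels, so running it from several seeds and keeping the lowest-cost result is a reasonable precaution.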
Felix X. Yu; Dong Liu; Sanjiv Kumar; Tony Jebara; Shih-Fu Chang. $\propto$SVM for Learning with Label Proportions. ICML 2013. [PDF] [Supp] [arXiv] [Code]