$\propto$SVM for learning with label proportions
This work addresses a practical machine learning problem, training a classifier when individual instance labels are unavailable, and offers an incremental improvement over existing methods.
The paper tackles learning with label proportions, where only group-level class proportions are known. It proposes the proportion-SVM method, which models latent instance labels in a large-margin framework. Experiments show that it outperforms state-of-the-art methods, particularly for larger group sizes.
We study the problem of learning with label proportions, in which the training data is provided in groups and only the proportion of each class in each group is known. We propose a new method called proportion-SVM, or $\propto$SVM, which explicitly models the latent unknown instance labels together with the known group label proportions in a large-margin framework. Unlike existing work, our approach avoids making restrictive assumptions about the data. The $\propto$SVM model leads to a non-convex integer programming problem. To solve it efficiently, we propose two algorithms: one based on simple alternating optimization and the other based on a convex relaxation. Extensive experiments on standard datasets show that $\propto$SVM outperforms the state-of-the-art, especially for larger group sizes.
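To make the alternating-optimization idea more concrete, the following is a minimal sketch of what the label-update half of such a scheme could look like for a single group, with the classifier held fixed. The abstract does not state the loss functions or the update rule, so everything specific here is an assumption made for illustration: the hinge loss on instances, the absolute difference between the group's positive fraction and its known proportion, the weights `c` and `c_p`, and the function name `assign_bag_labels`. It is not presented as the paper's exact algorithm.

```python
def hinge(label, score):
    """Hinge loss for a label in {-1, +1} and a real-valued classifier score."""
    return max(0.0, 1.0 - label * score)


def assign_bag_labels(scores, target_prop, c=1.0, c_p=1.0):
    """Choose labels in {-1, +1} for one group (bag) with the classifier fixed.

    Assumed objective (illustrative, not taken from the abstract):
        c * sum_i hinge(y_i, score_i) + c_p * |fraction_of_positives - target_prop|

    For a fixed number of positives, the cheapest labeling marks as positive
    the instances whose hinge loss drops the most when flipped from -1 to +1,
    so it suffices to sort once and scan every possible positive count.
    """
    n = len(scores)
    if n == 0:
        return []

    # Change in hinge loss when instance i is flipped from -1 to +1.
    deltas = [hinge(+1, s) - hinge(-1, s) for s in scores]
    order = sorted(range(n), key=lambda i: deltas[i])

    # Start from the all-negative labeling and flip instances one at a time.
    running = sum(hinge(-1, s) for s in scores)
    best_count, best_cost = 0, None
    for count in range(n + 1):
        if count > 0:
            running += deltas[order[count - 1]]
        cost = c * running + c_p * abs(count / n - target_prop)
        if best_cost is None or cost < best_cost:
            best_count, best_cost = count, cost

    labels = [-1] * n
    for i in order[:best_count]:
        labels[i] = 1
    return labels


if __name__ == "__main__":
    # Hypothetical scores for a group of four instances, half known to be positive.
    print(assign_bag_labels([2.0, 0.5, -0.3, -1.5], target_prop=0.5))
```

In a full alternating scheme, a step like this would be interleaved with retraining a standard SVM on the current label assignment, repeating until the labels stop changing. Because the overall problem is non-convex, such a procedure would in general reach only a local solution that depends on the initial labels.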