ML · LG · Oct 4, 2013

Weakly supervised clustering: Learning fine-grained signals from coarse labels

arXiv:1310.1363v3 · 6 citations
Originality: Incremental advance
AI Analysis

This paper addresses a practical data limitation for researchers and practitioners who work with aggregated labels. It appears incremental, however, because it builds on existing clustering and weak supervision concepts.

The paper tackles the problem of classification with only average labels over subpopulations, framing it as weakly supervised clustering, and proposes three approaches including a latent variables model that performs well in experiments on elections and industry data.

Consider a classification problem where we do not have access to labels for individual training examples, but only have average labels over subpopulations. We give practical examples of this setup and show how such a classification task can usefully be analyzed as a weakly supervised clustering problem. We propose three approaches to solving the weakly supervised clustering problem, including a latent variables model that performs well in our experiments. We illustrate our methods on an analysis of aggregated elections data and an industry data set that was the original motivation for this research.
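
To make the setup concrete, the following is a minimal sketch of learning from group averages, assuming synthetic data and a plain logistic model. It is not one of the paper's three approaches: it is a simple baseline that fits weights so that the mean predicted probability in each subpopulation matches the observed average label. All names here (`fit_from_group_averages`, `group_loss`, the simulated features and `w_true`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def group_loss(w, X, groups, group_means):
    """Half the mean squared gap between predicted and observed group averages."""
    counts = np.bincount(groups, minlength=len(group_means))
    p = sigmoid(X @ w)
    group_pred = np.bincount(groups, weights=p, minlength=len(group_means)) / counts
    return 0.5 * np.mean((group_pred - group_means) ** 2)

def fit_from_group_averages(X, groups, group_means, lr=1.0, n_iter=5000):
    """Gradient descent on group_loss; individual labels are never used."""
    n_groups = len(group_means)
    counts = np.bincount(groups, minlength=n_groups)  # assumes no empty group
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        group_pred = np.bincount(groups, weights=p, minlength=n_groups) / counts
        resid = group_pred - group_means
        # Chain rule: each row receives its group's residual, scaled by 1/|group|
        # and by the logistic derivative p * (1 - p).
        per_row = (resid / counts)[groups] * p * (1.0 - p)
        w -= lr * (X.T @ per_row) / n_groups
    return w

# Synthetic data (an assumption for illustration): 40 subpopulations of 50
# individuals, whose feature distributions differ from group to group.
rng = np.random.default_rng(0)
n_groups, per_group = 40, 50
groups = np.repeat(np.arange(n_groups), per_group)
shifts = rng.normal(size=(n_groups, 2))
features = shifts[groups] + rng.normal(size=(n_groups * per_group, 2))
X = np.column_stack([np.ones(len(groups)), features])

w_true = np.array([-0.5, 2.0, -1.0])
y = rng.random(len(groups)) < sigmoid(X @ w_true)  # individual labels, hidden from the learner

# The learner only observes the average label in each subpopulation.
counts = np.bincount(groups, minlength=n_groups)
group_means = np.bincount(groups, weights=y.astype(float), minlength=n_groups) / counts

loss_before = group_loss(np.zeros(X.shape[1]), X, groups, group_means)
w_hat = fit_from_group_averages(X, groups, group_means)
loss_after = group_loss(w_hat, X, groups, group_means)

# Individual labels are used here only to evaluate the fitted model.
accuracy = np.mean((sigmoid(X @ w_hat) > 0.5) == y)
print("estimated weights:", w_hat)
print("individual-level accuracy:", accuracy)
```

The sketch relies on the subpopulations having different feature distributions. If every group looked alike, the group averages would carry almost no information about how features relate to individual labels.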

