Amortized Variational Inference for Partial-Label Learning: A Probabilistic Approach to Label Disambiguation
This work addresses noisy and ambiguous labeling in crowdsourcing and similar domains, offering a scalable and principled way to train classifiers from partial labels.
The paper tackles partial-label learning, where each instance has multiple candidate labels but only one is correct. It introduces a probabilistic framework that uses amortized variational inference to approximate the true label posterior, and reports state-of-the-art accuracy and efficiency on synthetic and real-world datasets.
Real-world data is frequently noisy and ambiguous. In crowdsourcing, for example, human annotators may assign conflicting class labels to the same instances. Partial-label learning (PLL) addresses this challenge by training classifiers when each instance is associated with a set of candidate labels, only one of which is correct. While early PLL methods approximate the true label posterior, they are often computationally intensive. Recent deep learning approaches improve scalability but rely on surrogate losses and heuristic label refinement. We introduce a novel probabilistic framework that directly approximates the posterior distribution over true labels using amortized variational inference. Our method employs neural networks to predict variational parameters from input data, enabling efficient inference. This approach combines the expressiveness of deep learning with the rigor of probabilistic modeling, while remaining architecture-agnostic. Theoretical analysis and extensive experiments on synthetic and real-world datasets demonstrate that our method achieves state-of-the-art performance in both accuracy and efficiency.
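To make the idea of amortized inference over candidate labels concrete, the following is a minimal sketch, not the paper's actual model or objective. It assumes that an inference network has already produced per-class logits for one instance, that the variational posterior q(y | x, S) is a softmax restricted to the candidate set S, and that the prior is uniform over S. The function names (`candidate_posterior`, `neg_elbo`) and the exact form of the loss are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_posterior(inference_logits, candidates):
    """Assumed variational posterior q(y | x, S).

    Logits of classes outside the candidate set S are masked to -inf,
    so q puts zero mass on labels that cannot be the true one.
    """
    if not candidates:
        raise ValueError("candidate set must not be empty")
    masked = [
        z if k in candidates else -math.inf
        for k, z in enumerate(inference_logits)
    ]
    return softmax(masked)


def neg_elbo(q, classifier_logits, candidates):
    """Illustrative negative ELBO for one instance (an assumption,
    not the objective from the paper):

        E_q[-log p(y | x)] + KL(q || uniform prior over S)
    """
    p = softmax(classifier_logits)
    prior = 1.0 / len(candidates)
    expected_nll = -sum(qk * math.log(pk) for qk, pk in zip(q, p) if qk > 0)
    kl = sum(qk * math.log(qk / prior) for qk in q if qk > 0)
    return expected_nll + kl


# Hypothetical instance with 3 classes; candidate labels are {0, 2}.
inference_logits = [2.0, 1.0, 0.5]   # would come from the inference network
classifier_logits = [1.5, 0.2, 0.3]  # would come from the classifier
candidates = {0, 2}

q = candidate_posterior(inference_logits, candidates)
loss = neg_elbo(q, classifier_logits, candidates)
```

Because the logits are a function of the input rather than free per-instance parameters, the cost of inference is a single forward pass, which is the sense in which the inference is amortized. In practice both sets of logits would come from trainable networks and the loss would be minimized with a gradient-based optimizer.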