LGAI May 25, 2025

Learning from M-Tuple Dominant Positive and Unlabeled Data

arXiv:2506.15686v2
h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses a practical limitation of label proportion learning: in real-world applications, precise information about the class proportions within each group is hard to obtain. It appears to be an incremental improvement over existing methods.

The paper tackles learning from grouped data (tuples) in which exact class proportions are unavailable and only a weaker constraint is known: each tuple contains at least as many positive instances as negative ones. It proposes the MDPU framework, which pairs an unbiased risk estimator with a risk correction to curb overfitting. Experiments on multiple datasets show that it is effective relative to baseline methods.

Label Proportion Learning (LLP) addresses the classification problem where multiple instances are grouped into bags and each bag contains information about the proportion of each class. However, in practical applications, obtaining precise supervisory information regarding the proportion of instances in a specific class is challenging. To better align with real-world application scenarios and effectively leverage the proportional constraints of instances within tuples, this paper proposes a generalized learning framework, MDPU. Specifically, we first mathematically model the distribution of instances within tuples of arbitrary size, under the constraint that the number of positive instances is no less than that of negative instances. Then we derive an unbiased risk estimator that satisfies risk consistency based on the empirical risk minimization (ERM) method. To mitigate the inevitable overfitting issue during training, a risk correction method is introduced, leading to the development of a corrected risk estimator. The generalization error bounds of the unbiased risk estimator theoretically demonstrate the consistency of the proposed method. Extensive experiments on multiple datasets and comparisons with other relevant baseline methods comprehensively validate the effectiveness of the proposed learning framework.
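
The tuple constraint described in the abstract (positives no fewer than negatives) can be illustrated with a small calculation. The sketch below is not the paper's model: it assumes, purely for illustration, that the labels inside a tuple are independent Bernoulli draws with class prior `prior`, and the function name `dominant_positive_rate` is hypothetical. Under that assumption it computes the expected fraction of positive instances in a tuple of size `m` once the constraint is imposed.

```python
from math import comb


def dominant_positive_rate(m: int, prior: float) -> float:
    """Expected fraction of positives in an m-tuple, given positives >= negatives.

    Illustrative assumption (not taken from the paper): the m labels are
    i.i.d. Bernoulli(prior) before the dominance constraint is applied.
    """
    if m < 1 or not 0.0 < prior <= 1.0:
        raise ValueError("need m >= 1 and 0 < prior <= 1")

    # Smallest positive count k satisfying k >= m - k, i.e. k >= ceil(m / 2).
    k_min = (m + 1) // 2

    # Binomial probability of each admissible positive count k.
    weights = {
        k: comb(m, k) * prior**k * (1.0 - prior) ** (m - k)
        for k in range(k_min, m + 1)
    }
    total = sum(weights.values())  # probability that the constraint holds

    # Conditional expectation of k / m given the constraint.
    return sum(k * w for k, w in weights.items()) / (m * total)
```

For example, with `m = 2` and `prior = 0.5` the admissible label patterns are (+,+), (+,-), and (-,+), each equally likely, so the function returns 2/3: conditioning on dominance raises the positive rate above the unconditional prior of 0.5.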

