LG | Feb 3, 2025

Label Distribution Learning with Biased Annotations by Learning Multi-Label Representation

Zhiqiang Kou, Si Qin, Hailin Wang, Mingkun Xie, Shuo Chen, Yuheng Jia, Tongliang Liu, Masashi Sugiyama, Xin Geng

arXiv:2502.01170v1 | 14.47 citations | h-index: 9

Originality: Incremental advance

AI Analysis

This work addresses biased annotations in Label Distribution Learning (LDL), which matters for applications that require accurate label distributions. The advance is incremental: it retains the low-rank assumption of existing work but applies it to a multi-label representation rather than to the label distributions themselves.

The paper tackles the problem of inaccurate label distribution recovery in Label Distribution Learning (LDL) due to biased annotations by proposing a method that first degenerates soft label distributions into hard multi-hot labels and then recovers true label information, achieving improved performance on real-world datasets.

Multi-label learning (MLL) has gained attention for its ability to represent real-world data. Label Distribution Learning (LDL), an extension of MLL to learning from label distributions, faces challenges in collecting accurate label distributions. To address biased annotations, existing works rely on the low-rank assumption and recover true distributions from biased observations by exploring label correlations. However, recent evidence shows that the label distribution tends to be full-rank, and naively applying a low-rank approximation to biased observations leads to inaccurate recovery and performance degradation. In this paper, we address the problem of LDL with biased annotations from a novel perspective, where we first degenerate the soft label distribution into a hard multi-hot label and then recover the true label information for each instance. This idea stems from the insight that assigning hard multi-hot labels is often easier than assigning a soft label distribution, and that hard labels show stronger immunity to noise disturbances, leading to smaller label bias. Moreover, assuming that the multi-label space used for predicting label distributions is low-rank offers a more reasonable way to capture label correlations. Theoretical analysis and experiments confirm the effectiveness and robustness of our method on real-world datasets.
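The degeneration step described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not state how soft distributions are turned into multi-hot labels, so the thresholding rule below (a label counts as relevant when its description degree is at least 1/c, with c the number of labels) is an assumption made purely for illustration. The helper names `to_multi_hot` and `low_rank_approx` and the toy matrix `D` are likewise hypothetical.

```python
import numpy as np


def to_multi_hot(D, threshold=None):
    """Binarize each row of a label-distribution matrix D (n instances x c labels).

    Assumption (not from the paper): a label is treated as relevant when its
    description degree is at least `threshold`, defaulting to the uniform level 1/c.
    """
    D = np.asarray(D, dtype=float)
    if threshold is None:
        threshold = 1.0 / D.shape[1]
    return (D >= threshold).astype(int)


def low_rank_approx(M, rank):
    """Best rank-`rank` approximation of M in the Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(np.asarray(M, dtype=float), full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]


# Toy label distributions: rows 0 and 2 differ only slightly, as they might
# under annotation noise, yet that small difference makes D full row rank.
D = np.array([
    [0.60, 0.30, 0.05, 0.05],
    [0.10, 0.45, 0.40, 0.05],
    [0.55, 0.35, 0.05, 0.05],
])

Y = to_multi_hot(D)

print("rank of distribution matrix:", np.linalg.matrix_rank(D))
print("rank of multi-hot matrix:   ", np.linalg.matrix_rank(Y))
print("rank-2 reconstruction of Y is exact:", np.allclose(low_rank_approx(Y, 2), Y))
```

In this toy case the small perturbation between rows 0 and 2 disappears after binarization, so the multi-hot matrix has lower rank than the distribution matrix. That is the kind of behavior the abstract appeals to when it argues that the multi-label space is a more reasonable place for the low-rank assumption.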
