LG · Jan 4, 2023

Towards the Identifiability in Noisy Label Learning: A Multinomial Mixture Modelling Approach

arXiv:2301.01405v3 · 1 citation · h-index: 5
Originality: Highly original
AI Analysis

This paper addresses a foundational challenge in noisy label learning for deep learning, offering a theoretical identifiability guarantee that does not rely on heuristics.

The paper tackles the non-identifiability problem in learning from noisy labels by proving that clean labels can be estimated if there are at least 2C-1 i.i.d. noisy labels per instance, where C is the number of classes. It reports accurate clean-label estimation across synthetic, web-controlled, and real-world benchmarks, with model performance competitive with many state-of-the-art methods.

Learning from noisy labels (LNL) is crucial in deep learning, in which one of the approaches is to identify clean-label samples from poorly-annotated datasets. Such an identification is challenging because the conventional LNL problem, which assumes only one noisy label per instance, is non-identifiable, i.e., clean labels cannot be estimated theoretically without additional heuristics. This paper presents a novel data-driven approach that addresses this issue without requiring any heuristics about clean samples. We discover that the LNL problem becomes identifiable if there are at least $2C - 1$ i.i.d. noisy labels per instance, where $C$ is the number of classes. Our finding relies on the assumption of i.i.d. noisy labels and multinomial mixture modelling, making it easier to interpret than previous studies that require full-rank noisy-label transition matrices. To fulfil this condition without additional manual annotations, we propose a method that automatically generates additional i.i.d. noisy labels through nearest neighbours. These noisy labels are then used in the Expectation-Maximisation algorithm to infer clean labels. Our method demonstrably estimates clean labels accurately across various label noise benchmarks, including synthetic, web-controlled, and real-world datasets. Furthermore, the model trained with our method performs competitively with many state-of-the-art methods.
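
The identifiability condition and the EM step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a simplified, instance-independent multinomial mixture, and it simulates the $2C - 1$ noisy labels per instance instead of generating them through nearest neighbours. The function names (`min_noisy_labels`, `simulate_counts`, `em_multinomial_mixture`) and the diagonal-dominant initialisation are hypothetical choices made for this illustration.

```python
import numpy as np


def min_noisy_labels(num_classes):
    """Smallest number of i.i.d. noisy labels per instance stated in the abstract: 2C - 1."""
    return 2 * num_classes - 1


def simulate_counts(num_instances, num_classes, diag, seed=0):
    """Simulate noisy-label counts (an assumption; the paper uses nearest neighbours instead)."""
    rng = np.random.default_rng(seed)
    off = (1.0 - diag) / (num_classes - 1)
    # Row y holds the probability of each noisy label given clean label y.
    transition = np.full((num_classes, num_classes), off)
    np.fill_diagonal(transition, diag)
    clean = rng.integers(num_classes, size=num_instances)
    k = min_noisy_labels(num_classes)
    # Each row counts how often each class appears among the k noisy labels.
    counts = np.stack([rng.multinomial(k, transition[y]) for y in clean])
    return clean, counts


def em_multinomial_mixture(counts, num_iters=50, init_diag=0.6):
    """Fit a C-component multinomial mixture with EM and return clean-label posteriors."""
    counts = np.asarray(counts, dtype=float)
    _, c = counts.shape
    # Diagonal-dominant initialisation ties component i to class i
    # (an assumption that noisy labels are correct more often than not).
    rho = np.full((c, c), (1.0 - init_diag) / (c - 1))
    np.fill_diagonal(rho, init_diag)
    pi = np.full(c, 1.0 / c)
    for _ in range(num_iters):
        # E-step: log posterior up to a constant; the multinomial coefficient cancels.
        log_post = np.log(np.clip(pi, 1e-12, None))[None, :] + counts @ np.log(rho).T
        log_post -= log_post.max(axis=1, keepdims=True)
        gamma = np.exp(log_post)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and per-component label probabilities.
        pi = gamma.mean(axis=0)
        rho = gamma.T @ counts + 1e-8
        rho /= rho.sum(axis=1, keepdims=True)
    return gamma, pi, rho


if __name__ == "__main__":
    clean, counts = simulate_counts(num_instances=2000, num_classes=3, diag=0.8)
    gamma, pi, rho = em_multinomial_mixture(counts)
    accuracy = (gamma.argmax(axis=1) == clean).mean()
    print(f"noisy labels per instance: {min_noisy_labels(3)}, accuracy: {accuracy:.3f}")
```

Each row of `gamma` is the posterior over clean labels for one instance, so `gamma.argmax(axis=1)` gives the estimated clean label. The diagonal-dominant initialisation is what ties mixture components to class indices in this sketch; without some such anchor, a mixture model recovers components only up to a permutation.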
