CV, LG, ML · Jun 7, 2018

Dimensionality-Driven Learning with Noisy Labels

arXiv:1806.02612v2 · 471 citations
Originality: Highly original
AI Analysis

This paper addresses the challenge of noisy labels in machine learning datasets, offering practitioners a novel method for making training more robust.

The paper tackles the problem of training deep neural networks on datasets with noisy labels by analyzing the dimensionality of representation subspaces, and reports that its dimensionality-driven learning strategy tolerates significant proportions of noisy labels.

Datasets with significant proportions of noisy (incorrect) class labels present challenges for training accurate Deep Neural Networks (DNNs). We propose a new perspective for understanding DNN generalization for such datasets, by investigating the dimensionality of the deep representation subspace of training samples. We show that from a dimensionality perspective, DNNs exhibit quite distinctive learning styles when trained with clean labels versus when trained with a proportion of noisy labels. Based on this finding, we develop a new dimensionality-driven learning strategy, which monitors the dimensionality of subspaces during training and adapts the loss function accordingly. We empirically demonstrate that our approach is highly tolerant to significant proportions of noisy labels, and can effectively learn low-dimensional local subspaces that capture the data distribution.
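The monitoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how subspace dimensionality is measured or how the loss is adapted, so everything here is an assumption: the sketch uses a maximum-likelihood estimate of local intrinsic dimensionality computed from k-nearest-neighbour distances within a mini-batch, and a simple blend of given labels with model predictions. The names `lid_mle`, `blend_labels`, `k`, and `alpha` are hypothetical and not taken from the paper's code.

```python
import numpy as np


def lid_mle(batch, k=20):
    """Estimate local intrinsic dimensionality (LID) for each row of `batch`.

    Assumption: maximum-likelihood estimator over the k nearest neighbours
    found inside the same batch. Returns one estimate per sample.
    """
    batch = np.asarray(batch, dtype=np.float64)
    # Pairwise Euclidean distances between all samples in the batch.
    diffs = batch[:, None, :] - batch[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Sort each row; column 0 is the zero distance to the sample itself.
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    r_max = knn[:, -1:]  # distance to the k-th neighbour
    eps = 1e-12
    ratio = np.maximum(knn, eps) / np.maximum(r_max, eps)
    mean_log = np.log(ratio).mean(axis=1)  # always <= 0
    # Guard against division by zero when all neighbours are equidistant.
    return -1.0 / np.minimum(mean_log, -eps)


def blend_labels(y_onehot, y_pred, alpha):
    """Hypothetical loss adaptation: interpolate given labels and predictions.

    alpha = 1 trusts the given (possibly noisy) labels fully;
    alpha = 0 trusts the model's own predictions fully.
    """
    y_onehot = np.asarray(y_onehot, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    return alpha * y_onehot + (1.0 - alpha) * y_pred
```

In a training loop, one might compute `lid_mle` on the representations of each mini-batch, track the average over epochs, and lower `alpha` once the average starts to rise. How `alpha` should be scheduled is left open here; the blended targets would then replace the given labels in the cross-entropy loss.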
