LG, AI, CV | Apr 7, 2024

Data Stream Sampling with Fuzzy Task Boundaries and Noisy Labels

arXiv:2404.04871v1 | 1 citation | h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses reliability and fairness issues in continual learning on noisy data streams. It is an incremental improvement over existing approaches.

The paper tackles noisy labels in continual learning data streams with fuzzy task boundaries by introducing Noisy Test Debiasing (NTD). Compared to the previous leading approach, NTD trains more than twice as fast, maintains or improves accuracy, and uses less than one-fifth of the GPU memory.

In continual learning, noisy labels within data streams are a notable obstacle to model reliability and fairness. We focus on the data stream scenario outlined in the pertinent literature, characterized by fuzzy task boundaries and noisy labels. To address this challenge, we introduce Noisy Test Debiasing (NTD), a novel and intuitive sampling method that mitigates noisy labels in evolving data streams and yields a fair and robust continual learning algorithm. NTD is straightforward to implement, making it feasible across various scenarios. We benchmark on four datasets: two with synthetic noise (CIFAR10 and CIFAR100) and two with real-world noise (mini-WebVision and Food-101N). The results validate the efficacy of NTD for online continual learning with noisy labels in data streams. Compared to the previous leading approach, NTD trains more than twice as fast while maintaining or surpassing its accuracy. Moreover, NTD uses less than one-fifth of the GPU memory of previous leading methods.

Code Implementations: 1 repo