CV LG May 23, 2024

Adaptive Retention & Correction: Test-Time Training for Continual Learning

Haoran Chen, Micah Goldblum, Zuxuan Wu, Yu-Gang Jiang

arXiv:2405.14318v4 · 6 citations · h-index: 57

Originality: Incremental advance

AI Analysis

This work addresses continual learning in memory-free settings, where a model must keep adapting over time without storing data from past tasks. It offers an incremental improvement that plugs into existing methods without changing their training procedure.

The paper tackles the classification layer's bias toward recent tasks in continual learning, particularly in memory-free settings. It proposes Adaptive Retention & Correction (ARC), which adjusts the model at test time, and reports average performance gains of 2.7% on CIFAR-100 and 2.6% on ImageNet-R when integrated with state-of-the-art methods.

Continual learning, also known as lifelong learning or incremental learning, refers to the process by which a model learns from a stream of incoming data over time. A common problem in continual learning is the classification layer's bias towards the most recent task. Traditionally, methods have relied on incorporating data from past tasks during training to mitigate this issue. However, the recent shift in continual learning to memory-free environments has rendered these approaches infeasible. In this study, we propose a solution focused on the testing phase. We first introduce a simple Out-of-Task Detection method, OTD, designed to accurately identify samples from past tasks during testing. Leveraging OTD, we then propose: (1) an Adaptive Retention mechanism for dynamically tuning the classifier layer on past task data; (2) an Adaptive Correction mechanism for revising predictions when the model classifies data from previous tasks into classes from the current task. We name our approach Adaptive Retention & Correction (ARC). While designed for memory-free environments, ARC also proves effective in memory-based settings. Extensive experiments show that our proposed method can be plugged into virtually any existing continual learning approach without requiring any modifications to its training procedure. Specifically, when integrated with state-of-the-art approaches, ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively.
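To make the correction idea concrete, here is a minimal sketch in Python. The abstract does not say how OTD decides that a sample comes from a past task, so the rule used here (a current-task prediction with low softmax confidence) is an assumption, as are the function names `is_out_of_task` and `correct_prediction` and the `threshold` value. It illustrates the general idea of revising a prediction at test time and is not the authors' implementation. The Adaptive Retention step, which tunes the classifier layer, is not covered.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_out_of_task(probs, current_classes, threshold=0.8):
    """Hypothetical OTD rule (an assumption, not taken from the paper):
    flag a sample as coming from a past task when the top prediction is a
    current-task class but its confidence falls below `threshold`."""
    top = max(range(len(probs)), key=probs.__getitem__)
    return top in current_classes and probs[top] < threshold


def correct_prediction(logits, current_classes, threshold=0.8):
    """Return the predicted class, revised to the most likely past-task
    class when the sample is flagged as out-of-task."""
    probs = softmax(logits)
    top = max(range(len(probs)), key=probs.__getitem__)
    if not is_out_of_task(probs, current_classes, threshold):
        return top
    past_classes = [c for c in range(len(probs)) if c not in current_classes]
    if not past_classes:
        return top  # nothing to fall back to
    return max(past_classes, key=probs.__getitem__)


# Classes 0 and 1 belong to a past task; classes 2 and 3 to the current task.
current = {2, 3}
uncertain = correct_prediction([1.2, 0.1, 1.5, 0.3], current)  # revised to a past class
confident = correct_prediction([0.0, 0.0, 5.0, 0.0], current)  # kept as predicted
```

In the first call the top class is a current-task class with confidence of roughly 0.44, so the sketch falls back to the highest-scoring past-task class. In the second call the confidence is roughly 0.98, so the prediction is left unchanged. A fixed threshold is the simplest possible choice here; the right criterion would depend on the model and the data.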
