CV · AI · LG · Mar 18, 2025

Effortless Active Labeling for Long-Term Test-Time Adaptation

arXiv:2503.14564v1 · 7 citations · h-index: 4 · CVPR
Originality: Incremental advance
AI Analysis

This work addresses the annotation burden in long-term test-time adaptation for machine learning models, offering an incremental improvement over existing active labeling approaches.

The paper tackles error accumulation in long-term test-time adaptation with an active labeling method that selects at most one sample per batch for annotation. It uses feature perturbation to identify samples lying on the border between the source and target domains, and it balances the gradient contributions of annotated and unannotated samples with two dynamic weights. Experiments on ImageNet-C, -R, -K, -A, and PACS show that it outperforms state-of-the-art methods at significantly lower annotation cost.
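To make the one-sample-per-batch selection concrete, here is a minimal sketch of perturbation-based scoring. It is an illustration under stated assumptions, not the paper's procedure: the linear classifier head, the Gaussian feature noise, the L1 prediction-shift score, and the names `select_border_sample`, `noise_std`, and `n_draws` are all hypothetical. The idea shown is only that a sample whose prediction changes most under small feature perturbations is likely to sit near a decision border.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear_head(feature, weights):
    """Hypothetical classifier head: one weight vector per class."""
    return [sum(w * f for w, f in zip(row, feature)) for row in weights]


def select_border_sample(features, weights, noise_std=0.1, n_draws=32, seed=0):
    """Return (index, score) of the sample most sensitive to feature noise.

    Assumed scoring rule (not taken from the paper): the mean L1 distance
    between predictions on clean and Gaussian-perturbed features.
    """
    rng = random.Random(seed)
    best_index, best_score = None, -1.0
    for i, feat in enumerate(features):
        clean = softmax(linear_head(feat, weights))
        shift = 0.0
        for _ in range(n_draws):
            noisy_feat = [f + rng.gauss(0.0, noise_std) for f in feat]
            noisy = softmax(linear_head(noisy_feat, weights))
            shift += sum(abs(p - q) for p, q in zip(clean, noisy))
        score = shift / n_draws
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


if __name__ == "__main__":
    # Two confident samples and one near the decision boundary (index 1).
    head = [[1.0, 0.0], [0.0, 1.0]]
    batch = [[5.0, 0.0], [0.1, 0.0], [0.0, 6.0]]
    index, score = select_border_sample(batch, head)
    print(index, round(score, 4))
```

Only the returned index would be sent for annotation, so the labeling cost stays at one sample per batch regardless of batch size.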

Long-term test-time adaptation (TTA) is a challenging task due to error accumulation. Recent approaches tackle this issue by actively labeling a small proportion of samples in each batch, yet the annotation burden quickly grows as the batch number increases. In this paper, we investigate how to achieve effortless active labeling so that a maximum of one sample is selected for annotation in each batch. First, we annotate the most valuable sample in each batch based on the single-step optimization perspective in the TTA context. In this scenario, the samples that border between the source- and target-domain data distributions are considered the most feasible for the model to learn in one iteration. Then, we introduce an efficient strategy to identify these samples using feature perturbation. Second, we discover that the gradient magnitudes produced by the annotated and unannotated samples have significant variations. Therefore, we propose balancing their impact on model optimization using two dynamic weights. Extensive experiments on the popular ImageNet-C, -R, -K, -A and PACS databases demonstrate that our approach consistently outperforms state-of-the-art methods with significantly lower annotation costs.
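The abstract does not say how the two dynamic weights are computed, so the following is only a sketch of one plausible scheme. It assumes each weight is set so that the labeled and unlabeled gradients contribute equal magnitude, using exponential moving averages of their L2 norms. The class `GradientBalancer`, its `momentum` parameter, and the equal-magnitude target are assumptions made for illustration.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat gradient vector."""
    return math.sqrt(sum(v * v for v in vec))


class GradientBalancer:
    """Hypothetical balancer for labeled and unlabeled gradients.

    Tracks a running average of each gradient norm and rescales both
    gradients toward the mean of the two averages.
    """

    def __init__(self, momentum=0.9, eps=1e-12):
        self.momentum = momentum
        self.eps = eps
        self.avg_labeled = None
        self.avg_unlabeled = None

    def _ema(self, old, new):
        # The first observation initializes the running average.
        if old is None:
            return new
        return self.momentum * old + (1.0 - self.momentum) * new

    def combine(self, grad_labeled, grad_unlabeled):
        """Return (combined_gradient, weight_labeled, weight_unlabeled)."""
        if grad_labeled is None:
            # No sample was annotated in this batch: pass through unchanged.
            return list(grad_unlabeled), 0.0, 1.0
        self.avg_labeled = self._ema(self.avg_labeled, l2_norm(grad_labeled))
        self.avg_unlabeled = self._ema(self.avg_unlabeled, l2_norm(grad_unlabeled))
        target = 0.5 * (self.avg_labeled + self.avg_unlabeled)
        w_l = target / (self.avg_labeled + self.eps)
        w_u = target / (self.avg_unlabeled + self.eps)
        combined = [w_l * a + w_u * b for a, b in zip(grad_labeled, grad_unlabeled)]
        return combined, w_l, w_u


if __name__ == "__main__":
    balancer = GradientBalancer()
    # The labeled gradient is ten times larger than the unlabeled one.
    combined, w_l, w_u = balancer.combine([3.0, 4.0], [0.3, 0.4])
    print(combined, w_l, w_u)
```

With these inputs the larger labeled gradient is scaled down and the smaller unlabeled gradient is scaled up, so neither term dominates the update. Other balancing rules would fit the abstract's description equally well.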

