CV · Apr 19

Dual Strategies for Test-Time Adaptation

arXiv:2604.17542 · 58.2 · h-index: 10
Predicted impact: top 59% in CV · last 90 days
Originality: Incremental advance
AI Analysis

For practitioners deploying models under distribution shifts, DualTTA offers a more robust TTA method that better leverages test data, though it is an incremental improvement over existing entropy-based TTA approaches.

DualTTA improves test-time adaptation under distribution shifts by using a larger, more diverse set of test samples, adaptively selecting reliable and unreliable groups via a new stability-based criterion, and applying opposing entropy objectives to each, achieving provably more effective model updates.

Conventional test-time adaptation (TTA) approaches typically adapt the model using only a small fraction of test samples, often those with low-entropy predictions, thereby failing to fully leverage the available information in the test distribution. This paper introduces DualTTA, a novel framework that improves performance under distribution shifts by utilizing a larger and more diverse set of test samples. DualTTA identifies two distinct groups: one where the model's predictions are likely consistent with the underlying semantics, and another where predictions are likely incorrect. For the first group, it minimizes prediction entropy to reinforce reliable decisions; for the second, it maximizes entropy to suppress overconfident errors and unlearn spurious behavior. These groups are adaptively selected using a new reliability criterion that measures prediction stability under both semantic-preserving and semantic-altering transformations, addressing the limitations of purely entropy-based selection. We further provide theoretical analysis and empirical justification showing that our approach enables a tighter separation between reliable and unreliable samples, in the context of their suitability for adaptation, leading to provably more effective model updates.
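The dual objective described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact reliability criterion, so the rule in `assign_group` (same predicted class under a semantic-preserving transformation, lower confidence under a semantic-altering one) is an assumption, as are all function names and the `weight` parameter. The sketch only computes the scalar objective from already-computed class probabilities; an actual adaptation step would backpropagate it through the model.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


def assign_group(p_orig, p_preserved, p_altered):
    """Hypothetical stability rule (an assumption, not the paper's criterion).

    p_orig:      probabilities on the original test sample
    p_preserved: probabilities after a semantic-preserving transformation
    p_altered:   probabilities after a semantic-altering transformation
    """
    label = argmax(p_orig)
    # Stable: the predicted class survives a semantic-preserving change.
    stable = argmax(p_preserved) == label
    # Sensitive: confidence in that class drops once the semantics are altered.
    sensitive = p_altered[label] < p_orig[label]
    if stable and sensitive:
        return "reliable"
    if not stable:
        return "unreliable"
    return "skip"


def dual_entropy_loss(batch, weight=1.0):
    """Entropy is minimized on reliable samples and maximized on unreliable ones.

    batch:  list of (p_orig, p_preserved, p_altered) triples
    weight: hypothetical scaling of the entropy-maximization term
    """
    total = 0.0
    for p_orig, p_preserved, p_altered in batch:
        group = assign_group(p_orig, p_preserved, p_altered)
        h = entropy(p_orig)
        if group == "reliable":
            total += h           # minimizing the loss lowers entropy here
        elif group == "unreliable":
            total -= weight * h  # minimizing the loss raises entropy here
    return total / len(batch)
```

Under this assumed rule, samples that are stable but insensitive to the semantic-altering transformation fall into neither group and contribute nothing to the update.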
