LG · Oct 18, 2022

Towards Understanding GD with Hard and Conjugate Pseudo-labels for Test-Time Adaptation

arXiv:2210.10019v4 · 11 citations · h-index: 19
Originality: Incremental advance
AI Analysis

This work provides theoretical insights into when and why different pseudo-labeling methods work in test-time adaptation, an incremental advance for researchers in domain adaptation.

The paper tackles the problem of test-time adaptation under distribution shifts by theoretically analyzing gradient descent (GD) with hard and conjugate pseudo-labels for binary classification. It shows that under a Gaussian model with square loss, GD with conjugate labels converges to an ε-optimal predictor, while GD with hard labels fails.

We consider a setting in which a model needs to adapt to a new domain under distribution shift, given that only unlabeled test samples from the new domain are accessible at test time. A common idea in most of the related work is to construct pseudo-labels for the unlabeled test samples and apply gradient descent (GD) to a loss function with the pseudo-labels. Recently, \cite{GSRK22} proposed conjugate labels, a new kind of pseudo-label for self-training at test time. They empirically show that conjugate labels outperform other ways of pseudo-labeling on many domain adaptation benchmarks. However, provably showing that GD with conjugate labels learns a good classifier for test-time adaptation remains open. In this work, we aim to theoretically understand GD with hard and conjugate labels for a binary classification problem. We show that for square loss, GD with conjugate labels converges to an $ε$-optimal predictor under a Gaussian model for any arbitrarily small $ε$, while GD with hard pseudo-labels fails in this task. We also analyze them under different loss functions for the update. Our results shed light on when and why GD with hard labels or conjugate labels works in test-time adaptation.
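To make the two update rules concrete, here is a minimal Python sketch of GD with hard and conjugate pseudo-labels on a linear predictor $h = w^\top x$ with square loss. It rests on several assumptions that are not stated in the abstract: the Gaussian model is taken to be $x = y\mu + \sigma z$ with a hidden label $y \in \{-1, +1\}$; the hard pseudo-label is $\mathrm{sign}(h)$, held fixed when differentiating; and the conjugate pseudo-label for square loss is taken to be the model output $h$ itself, which turns the self-training loss into $-\tfrac{1}{2}h^2$. The function names, step size, and initialization are hypothetical, and the paper's exact update (for example, any normalization) may differ.

```python
import math
import random


def sample_gaussian_model(n, mu, sigma, rng):
    """Draw n unlabeled points x = y*mu + sigma*z, with hidden y in {-1, +1}."""
    xs = []
    for _ in range(n):
        y = rng.choice((-1.0, 1.0))
        xs.append([y * m + sigma * rng.gauss(0.0, 1.0) for m in mu])
    return xs


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def grad_hard(w, xs):
    """Gradient of mean 0.5*(sign(h) - h)^2, with the label sign(h) held fixed."""
    g = [0.0] * len(w)
    for x in xs:
        h = dot(w, x)
        y_hard = 1.0 if h >= 0 else -1.0
        for j, xj in enumerate(x):
            g[j] += (h - y_hard) * xj
    return [gj / len(xs) for gj in g]


def grad_conjugate(w, xs):
    """Gradient of mean -0.5*h^2 (assumed conjugate self-training loss)."""
    g = [0.0] * len(w)
    for x in xs:
        h = dot(w, x)
        for j, xj in enumerate(x):
            g[j] -= h * xj
    return [gj / len(xs) for gj in g]


def run_gd(w0, xs, grad_fn, lr=0.1, steps=30):
    """Plain, unnormalized gradient descent on the chosen self-training loss."""
    w = list(w0)
    for _ in range(steps):
        g = grad_fn(w, xs)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


if __name__ == "__main__":
    rng = random.Random(0)
    mu = [2.0, 0.0, 0.0, 0.0, 0.0]        # hypothetical class mean
    xs = sample_gaussian_model(2000, mu, 1.0, rng)
    w0 = [0.2, 1.0, 0.0, 0.0, 0.0]        # weakly aligned initial predictor
    for name, grad_fn in (("hard", grad_hard), ("conjugate", grad_conjugate)):
        w = run_gd(w0, xs, grad_fn)
        print(name, "cosine with mu:", round(cosine(w, mu), 4))
```

Under these assumptions the conjugate update is $w \leftarrow w + \eta \cdot \tfrac{1}{n}\sum_i x_i x_i^\top w$, which behaves like power iteration on the empirical second-moment matrix, so the direction of $w$ drifts toward $\pm\mu$. The sketch only illustrates the mechanics of the two updates; it does not reproduce the paper's convergence or failure results.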

