LG · AI · ML · Jun 11, 2021

Online Continual Adaptation with Active Self-Training

arXiv:2106.06526v2 · 14 citations
Originality: Incremental advance
AI Analysis

This paper addresses continual adaptation of machine learning models in dynamic environments where labels are limited. It is an incremental advance that combines online learning, self-training, and active label querying.

Models deployed in changing environments face continual distribution shifts, and labels are expensive to obtain. The paper proposes Online Self-Adaptive Mirror Descent (OSAMD), which self-trains online from unlabeled data and actively queries labels when needed. OSAMD achieves an $O(T^{2/3})$ dynamic regret bound in the separable case and shows favorable empirical performance on simulated and real-world data.
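For context, dynamic regret compares the learner with the best decision in each round rather than with a single fixed comparator. The form below is the standard textbook definition, given as a sketch; the symbols $f_t$ (loss at round $t$), $\theta_t$ (the learner's parameters), and $\Theta$ (the feasible set) are assumed notation that does not appear on this page, and the paper's exact comparator may differ.

$$
\mathrm{Regret}^{\mathrm{dyn}}_T \;=\; \sum_{t=1}^{T} f_t(\theta_t) \;-\; \sum_{t=1}^{T} \min_{\theta \in \Theta} f_t(\theta)
$$

Under this definition, an $O(T^{2/3})$ bound means the average per-round regret is $O(T^{-1/3})$, which vanishes as $T$ grows.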

Models trained with offline data often suffer from continual distribution shifts and expensive labeling in changing environments. This calls for a new online learning paradigm where the learner can continually adapt to changing environments with limited labels. In this paper, we propose a new online setting, Online Active Continual Adaptation, where the learner aims to continually adapt to changing distributions using both unlabeled samples and active queries of limited labels. To this end, we propose Online Self-Adaptive Mirror Descent (OSAMD), which adopts an online teacher-student structure to enable online self-training from unlabeled data, and a margin-based criterion that decides whether to query the labels to track changing distributions. Theoretically, we show that, in the separable case, OSAMD has an $O(T^{2/3})$ dynamic regret bound under mild assumptions, which is aligned with the $\Omega(T^{2/3})$ lower bound of online learning algorithms with full labels. In the general case, we show a regret bound of $O(T^{2/3} + \alpha^* T)$, where $\alpha^*$ denotes the separability of domains and is usually small. Our theoretical results show that OSAMD can adapt quickly to changing environments with active queries. Empirically, we demonstrate that OSAMD achieves favorable regrets under changing environments with limited labels on both simulated and real-world data, which corroborates our theoretical findings.
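The interplay of self-training and margin-based querying can be illustrated with a small sketch. This is not the authors' OSAMD: it assumes a linear model, a logistic loss, a plain gradient step (mirror descent with a Euclidean regularizer), and it lets the current model act as its own teacher. All names (`should_query`, `drifting_stream`, `run`) and all parameter values are hypothetical and chosen for illustration only.

```python
import math
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def should_query(score, threshold):
    """Margin-based rule (assumed form): query only when the model is unsure."""
    return abs(score) < threshold


def drifting_stream(T, drift=0.002, seed=0):
    """Simulated separable stream whose true separator rotates slowly."""
    rng = random.Random(seed)
    for t in range(T):
        angle = drift * t
        u = (math.cos(angle), math.sin(angle))  # true direction at round t
        while True:
            phi = rng.uniform(0.0, 2.0 * math.pi)
            x = (math.cos(phi), math.sin(phi))
            if abs(dot(u, x)) > 0.1:  # keep a margin so the data stay separable
                break
        yield x, (1 if dot(u, x) > 0 else -1)


def run(stream, dim=2, lr=0.5, threshold=0.5):
    """Online loop: predict, then update on a queried label or a pseudo-label."""
    w = [0.0] * dim
    queries = mistakes = rounds = 0
    for x, y in stream:
        score = dot(w, x)
        pred = 1 if score >= 0 else -1
        mistakes += int(pred != y)  # y is used here for evaluation only
        rounds += 1
        if should_query(score, threshold):
            label = y  # low margin: pay for the true label
            queries += 1
        else:
            label = pred  # high margin: self-train on the model's own prediction
        # Gradient of the logistic loss log(1 + exp(-label * score)) w.r.t. score.
        g = -label / (1.0 + math.exp(label * score))
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
    return w, queries, mistakes, rounds
```

Calling `w, queries, mistakes, rounds = run(drifting_stream(2000))` returns the final weights together with the number of label queries and prediction mistakes, so the label budget spent can be compared against the error incurred while the separator drifts.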
