LG ST ML | May 31, 2025

Active Learning via Regression Beyond Realizability

Atul Ganju, Shashaank Aiyer, Ved Sriraman, Karthik Sridharan

arXiv:2506.00316v1 | 4.1 | h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses a limitation in active learning for practical, misspecified scenarios, though it appears incremental as it builds on existing surrogate-based frameworks.

The paper tackles active learning for multiclass classification without the standard realizability assumption. It shows that, under weaker conditions and with a convex model class, label and sample complexity comparable to prior work can still be achieved, whereas prior methods can fail in these non-realizable settings.

We present a new active learning framework for multiclass classification based on surrogate risk minimization that operates beyond the standard realizability assumption. Existing surrogate-based active learning algorithms crucially rely on realizability (the assumption that the optimal surrogate predictor lies within the model class), which limits their applicability in practical, misspecified settings. In this work, we show that under conditions significantly weaker than realizability, as long as the class of models considered is convex, one can still obtain a label and sample complexity comparable to prior work. Despite achieving similar rates, the algorithmic approaches from prior works can be shown to fail in non-realizable settings where our assumption is satisfied. Our epoch-based active learning algorithm departs from prior methods by fitting a model from the full class to the queried data in each epoch and returning an improper classifier obtained by aggregating these models.
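To make the epoch structure described in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the abstract does not specify the query rule, the surrogate loss, or the aggregation step, so the choices below (a linear score class fit by ridge-regularized least squares, a score-margin query rule with a hypothetical `margin` threshold, and a plurality vote over per-epoch models) are all illustrative assumptions. It only mirrors the two ingredients the abstract names: refitting over the full model class on the queried data in each epoch, and returning an aggregate of the per-epoch models.

```python
import numpy as np

def fit_least_squares(X, Y, ridge=1e-3):
    """Fit a linear score model W (d x K) by ridge-regularized least squares.

    Assumption: square loss stands in for the surrogate loss, and linear
    models stand in for the convex model class.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ Y)

def active_learn(X, y, num_classes, num_epochs=4, margin=0.2):
    """Toy epoch-based active learner (illustrative, not the paper's method).

    Returns a predict function and the number of labels queried.
    """
    n = X.shape[0]
    epochs = np.array_split(np.arange(n), num_epochs)
    one_hot = np.eye(num_classes)
    models, qX, qY = [], [], []

    for idx in epochs:
        for i in idx:
            if models:
                # Hypothetical query rule: ask for the label only when the
                # latest model's top two scores are within `margin`.
                scores = np.sort(X[i] @ models[-1])
                uncertain = scores[-1] - scores[-2] < margin
            else:
                uncertain = True  # no model yet: query the whole first epoch
            if uncertain:
                qX.append(X[i])
                qY.append(one_hot[y[i]])
        # Refit over the full class on all data queried so far.
        models.append(fit_least_squares(np.array(qX), np.array(qY)))

    def predict(X_new):
        # Aggregate by plurality vote over the per-epoch models. A vote over
        # several linear classifiers need not itself be a linear classifier.
        rows = np.arange(len(X_new))
        votes = np.zeros((len(X_new), num_classes))
        for W in models:
            votes[rows, (X_new @ W).argmax(axis=1)] += 1
        return votes.argmax(axis=1)

    return predict, len(qX)
```

Calling `active_learn(X, y, num_classes=3)` on a feature matrix `X` (with a constant bias column appended) and integer labels `y` returns the voting classifier and the label count. In this sketch, only the first epoch is labeled in full; later epochs request labels only for points the latest model finds ambiguous.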
