LG AI ML · Dec 30, 2025

Interactive Machine Learning: From Theory to Scale

arXiv:2512.23924v14.1

Originality: Highly original

AI Analysis

The dissertation addresses the high cost of data acquisition and trial-and-error in large-scale or high-stakes applications, pairing foundational theoretical advances with practical deployment guidance.

This dissertation tackles the cost of data labeling and decision-making in machine learning by developing interactive learning algorithms that actively guide information collection. Its results include active learning algorithms that achieve exponential label savings and efficient contextual bandit algorithms whose guarantees are independent of the size of the action space.

Machine learning has achieved remarkable success across a wide range of applications, yet many of its most effective methods rely on access to large amounts of labeled data or extensive online interaction. In practice, acquiring high-quality labels and making decisions through trial-and-error can be expensive, time-consuming, or risky, particularly in large-scale or high-stakes settings. This dissertation studies interactive machine learning, in which the learner actively influences how information is collected or which actions are taken, using past observations to guide future interactions. We develop new algorithmic principles and establish fundamental limits for interactive learning along three dimensions: active learning with noisy data and rich model classes, sequential decision making with large action spaces, and model selection under partial feedback. Our results include the first computationally efficient active learning algorithms achieving exponential label savings without low-noise assumptions; the first efficient, general-purpose contextual bandit algorithms whose guarantees are independent of the size of the action space; and the first tight characterizations of the fundamental cost of model selection in sequential decision making. Overall, this dissertation advances the theoretical foundations of interactive learning by developing algorithms that are statistically optimal and computationally efficient, while also providing principled guidance for deploying interactive learning methods in large-scale, real-world settings.
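To make the phrase "exponential label savings" concrete, the following is a minimal textbook-style sketch, not the dissertation's algorithm. It assumes a noiseless one-dimensional threshold classifier on [0, 1], which is far simpler than the noisy, rich-model-class setting the abstract describes. The function names (`make_oracle`, `active_threshold`, `passive_threshold`) and the example threshold are hypothetical and chosen only for illustration. In this toy setting an active learner that picks its own queries needs roughly log2(1/eps) labels to locate the threshold to within eps, whereas a passive learner labeling uniformly random points typically needs on the order of 1/eps.

```python
import random


def make_oracle(threshold):
    """Return a noiseless labeling function and a counter of label queries.

    Assumption for illustration: labels are 1 for x >= threshold, else 0.
    """
    calls = {"n": 0}

    def label(x):
        calls["n"] += 1
        return 1 if x >= threshold else 0

    return label, calls


def active_threshold(label, eps):
    """Active learner: binary search on [0, 1].

    Each query halves the interval known to contain the threshold,
    so about log2(1/eps) labels are requested.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if label(mid) == 1:
            hi = mid  # threshold is at or below mid
        else:
            lo = mid  # threshold is above mid
    return (lo + hi) / 2


def passive_threshold(label, n, rng):
    """Passive learner: label n uniformly random points.

    The estimate is the midpoint between the largest point labeled 0
    and the smallest point labeled 1.
    """
    lo, hi = 0.0, 1.0
    for _ in range(n):
        x = rng.random()
        if label(x) == 1:
            hi = min(hi, x)
        else:
            lo = max(lo, x)
    return (lo + hi) / 2


if __name__ == "__main__":
    true_threshold = 0.3712  # hypothetical value for the demo
    label, calls = make_oracle(true_threshold)
    estimate = active_threshold(label, eps=1e-3)
    print("active: labels used =", calls["n"], "estimate =", estimate)

    label, calls = make_oracle(true_threshold)
    estimate = passive_threshold(label, n=1000, rng=random.Random(0))
    print("passive: labels used =", calls["n"], "estimate =", estimate)
```

The binary search relies on every label being correct, so a single flipped label can send it to the wrong half of the interval. That fragility is one reason noise makes active learning hard, and it is the gap the abstract's "without low-noise assumptions" result speaks to.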

