Near-optimal inference in adaptive linear regression
This work addresses the challenge of reliable statistical inference in adaptive data collection settings, such as bandits and active learning, which is crucial for practitioners in machine learning and statistics.
The paper tackles the problem of non-normal asymptotic behavior in ordinary least squares when data is collected adaptively, which leads to erroneous hypothesis tests and confidence intervals. The authors propose online debiasing estimators that correct these issues, establish asymptotic normality and exact confidence intervals, and prove minimax lower bounds, with applications in multi-armed bandits, autoregressive time series, and active learning.
When data is collected in an adaptive manner, even simple methods like ordinary least squares can exhibit non-normal asymptotic behavior. As an undesirable consequence, hypothesis tests and confidence intervals based on asymptotic normality can lead to erroneous results. We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation. Our proposed methods take advantage of the covariance structure present in the dataset and provide sharper estimates in directions for which more information has accrued. We establish an asymptotic normality property for our proposed online debiasing estimators under mild conditions on the data collection process and provide asymptotically exact confidence intervals. We additionally prove a minimax lower bound for the adaptive linear regression problem, thereby providing a baseline by which to compare estimators. There are various conditions under which our proposed estimators achieve the minimax lower bound. We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.