ML LG OC

Jun 20, 2019

Online A-Optimal Design and Active Linear Regression

Xavier Fontaine, Pierre Perrault, Michal Valko, Vianney Perchet

arXiv:1906.08509v26.46 citations

Originality: Incremental advance

AI Analysis

This work addresses efficient resource allocation in statistical learning for decision-makers, offering incremental improvements in active linear regression under heteroscedasticity.

The paper tackles optimal experiment design for linear regression under heteroscedastic noise, where each covariate has a different and unknown variance. It proposes an active sampling algorithm that allocates a budget of $T$ samples to minimize the expected squared error of the parameter estimate. It achieves a $\mathcal{O}(T^{-2})$ regret bound when the covariates form a basis of the feature space, generalizing and improving prior results, and supports the theory with numerical experiments.

We consider the problem of optimal experiment design, where a decision maker can choose which points to sample to obtain an estimate $\hat{β}$ of the hidden parameter $β^{\star}$ of an underlying linear model. The key challenge lies in the heteroscedasticity assumption that we make: each covariate has a different and unknown variance. The goal of the decision maker is then to figure out on the fly the optimal way to allocate the total budget of $T$ samples between covariates, since sampling a specific covariate several times reduces the variance of the estimated model around it (but at the cost of a possibly higher variance elsewhere). By minimizing the $\ell^2$-loss $\mathbb{E}[\lVert\hat{β}-β^{\star}\rVert^2]$, the decision maker is actually minimizing the trace of the covariance matrix of the estimator, which corresponds to online A-optimal design. Combining techniques from bandits and convex optimization, we propose a new active sampling algorithm and compare it with existing ones. We provide theoretical guarantees for this algorithm in different settings, including a $\mathcal{O}(T^{-2})$ regret bound in the case where the covariates form a basis of the feature space, generalizing and improving existing results. Numerical experiments validate our theoretical findings.
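To make the allocation trade-off concrete, here is a minimal sketch of the offline version of the problem. It is not the paper's algorithm, which has to learn the variances online. It assumes the simplest basis case, where the covariates are the canonical basis vectors, and it assumes the noise standard deviations are known. Under those assumptions each coordinate of $β^{\star}$ is estimated by an average of $T p_k$ noisy observations, so the trace of the covariance is $\sum_k σ_k^2 / (T p_k)$, and it is minimized over the simplex by taking $p_k$ proportional to $σ_k$. The function names and the numeric values are hypothetical, chosen only for illustration.

```python
def a_optimal_loss(sigmas, proportions, T):
    """Trace of the covariance of the least-squares estimate.

    Assumption: the covariates are the canonical basis vectors, so
    coordinate k is an average of T * proportions[k] observations
    with noise variance sigmas[k] ** 2.
    """
    return sum(s ** 2 / (T * p) for s, p in zip(sigmas, proportions))


def oracle_allocation(sigmas):
    """Minimizer of sum_k sigma_k^2 / p_k over the simplex: p_k ∝ sigma_k.

    Assumption: the standard deviations are known (offline oracle).
    """
    total = sum(sigmas)
    return [s / total for s in sigmas]


sigmas = [0.1, 1.0, 2.0]  # hypothetical noise standard deviations
T = 1000                  # hypothetical sampling budget

uniform = [1.0 / len(sigmas)] * len(sigmas)
optimal = oracle_allocation(sigmas)

loss_uniform = a_optimal_loss(sigmas, uniform, T)
loss_optimal = a_optimal_loss(sigmas, optimal, T)

print("uniform allocation loss:", loss_uniform)
print("oracle allocation loss: ", loss_optimal)
```

In this sketch the oracle loss equals $(\sum_k σ_k)^2 / T$, which is never larger than the loss of the uniform allocation. An online method that does not know the variances can be measured against this kind of benchmark.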
