LGMar 31, 2022

Minimum mean-squared error estimation with bandit feedback

Ayon Ghosh, L. A. Prashanth, Dipayan Sen, Aditya Gopalan

arXiv:2203.16810v41.8

Originality Incremental advance

AI Analysis

This addresses a statistical estimation challenge with bandit feedback, which is incremental as it builds on existing MSE and bandit frameworks.

The paper tackles the problem of sequentially learning to estimate a Gaussian vector's mean squared error (MSE) with limited observations per round, proposing non-adaptive and adaptive estimators and an algorithm to find the MSE-optimal subset, with derived concentration bounds and a minimax lower bound.

We consider the problem of sequentially learning to estimate, in the mean squared error (MSE) sense, a Gaussian $K$-vector of unknown covariance by observing only $m < K$ of its entries in each round. We propose two MSE estimators, and analyze their concentration properties. The first estimator is non-adaptive, as it is tied to a predetermined $m$-subset and lacks the flexibility to transition to alternative subsets. The second estimator, which is derived using a regression framework, is adaptive and exhibits better concentration bounds in comparison to the first estimator. We frame the MSE estimation problem with bandit feedback, where the objective is to find the MSE-optimal subset with high confidence. We propose a variant of the successive elimination algorithm to solve this problem. We also derive a minimax lower bound to understand the fundamental limit on the sample complexity of this problem.

View on arXiv PDF

Similar