LG AI ML | Jan 10, 2025

On The Statistical Complexity of Offline Decision-Making

arXiv:2501.06339v1 | 2 citations | h-index: 6 | ICML

AI Analysis

For offline RL researchers, the paper addresses foundational statistical limits and offers an incremental improvement in how data coverage is characterized.

The paper tackles the statistical complexity of offline decision-making with function approximation, establishing near minimax-optimal rates for contextual bandits and MDPs, and introduces a new coverage characterization that subsumes prior notions.

We study the statistical complexity of offline decision-making with function approximation, establishing (near) minimax-optimal rates for stochastic contextual bandits and Markov decision processes. The performance limits are captured by the pseudo-dimension of the (value) function class and a new characterization of the behavior policy that strictly subsumes all the previous notions of data coverage in the offline decision-making literature. In addition, we seek to understand the benefits of using offline data in online decision-making and show nearly minimax-optimal rates in a wide range of regimes.
