LGAIOCMLJul 6, 2021

A Short Note on the Relationship of Information Gain and Eluder Dimension

arXiv:2107.02377v110 citations
Originality Synthesis-oriented
AI Analysis

This clarifies theoretical relationships in bandit and reinforcement learning, but it is incremental as it builds on existing complexity measures without introducing new methods.

The paper demonstrates that eluder dimension and information gain are equivalent complexity measures for reproducing kernel Hilbert spaces, showing this equivalence through the shared use of the elliptic potential lemma in their analysis.

Eluder dimension and information gain are two widely used methods of complexity measures in bandit and reinforcement learning. Eluder dimension was originally proposed as a general complexity measure of function classes, but the common examples of where it is known to be small are function spaces (vector spaces). In these cases, the primary tool to upper bound the eluder dimension is the elliptic potential lemma. Interestingly, the elliptic potential lemma also features prominently in the analysis of linear bandits/reinforcement learning and their nonparametric generalization, the information gain. We show that this is not a coincidence -- eluder dimension and information gain are equivalent in a precise sense for reproducing kernel Hilbert spaces.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes