LG AI | Sep 16, 2016

Exploration Potential

arXiv:1609.04994v3 | 2.72 citations

Originality: Incremental advance

AI Analysis

This paper addresses efficient exploration for reinforcement learning agents. The contribution appears incremental because it builds on existing concepts such as information gain.

The paper introduces exploration potential, a measure of how much a reinforcement learning agent has explored its environment class. Unlike information gain, it accounts for the problem's reward structure, and it yields an exploration criterion that is both necessary and sufficient for asymptotic optimality. Experiments in multi-armed bandits use it to show how different algorithms trade off exploration against exploitation.

We introduce exploration potential, a quantity that measures how much a reinforcement learning agent has explored its environment class. In contrast to information gain, exploration potential takes the problem's reward structure into account. This leads to an exploration criterion that is both necessary and sufficient for asymptotic optimality (learning to act optimally across the entire environment class). Our experiments in multi-armed bandits use exploration potential to illustrate how different algorithms make the tradeoff between exploration and exploitation.
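
The abstract does not state the formula, so the following is only a minimal sketch under an assumed form of the quantity: for each candidate environment, take the gap between the reward its own optimal policy expects and the reward that policy would actually earn in the true environment, then weight the gaps by the posterior. The finite class of two-armed Bernoulli bandits, the function names `posterior` and `exploration_potential`, and the numbers are all illustrative assumptions, not taken from the paper.

```python
import math

def posterior(prior, envs, pulls, successes):
    """Posterior over a finite class of Bernoulli bandits.

    envs[i][a] is the success probability of arm a in candidate i
    (must lie strictly between 0 and 1); pulls and successes are per-arm counts.
    """
    log_post = []
    for weight, means in zip(prior, envs):
        ll = math.log(weight)
        for p, n, s in zip(means, pulls, successes):
            ll += s * math.log(p) + (n - s) * math.log(1.0 - p)
        log_post.append(ll)
    top = max(log_post)  # subtract the max for numerical stability
    unnorm = [math.exp(l - top) for l in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def exploration_potential(post, envs, true_env):
    """Assumed form: posterior-weighted gap between believed and actual value.

    In a bandit, the optimal policy of a candidate pulls its best arm forever,
    so per-step values reduce to arm means.
    """
    ep = 0.0
    for weight, means in zip(post, envs):
        best_arm = max(range(len(means)), key=means.__getitem__)
        believed = means[best_arm]   # what the candidate expects to earn
        actual = true_env[best_arm]  # what that arm really pays
        ep += weight * abs(believed - actual)
    return ep

# Hypothetical class: two candidates that disagree on which arm is better.
ENVS = [[0.7, 0.3], [0.3, 0.7]]
TRUE_ENV = ENVS[0]
PRIOR = [0.5, 0.5]

ep_before = exploration_potential(
    posterior(PRIOR, ENVS, [0, 0], [0, 0]), ENVS, TRUE_ENV)
ep_after = exploration_potential(
    posterior(PRIOR, ENVS, [20, 0], [14, 0]), ENVS, TRUE_ENV)
print(ep_before, ep_after)
```

In this toy setting the value starts at 0.2 and shrinks toward zero once 20 pulls of the first arm have concentrated the posterior on the true candidate. That matches the intuition in the abstract that little reward-relevant exploration then remains.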
