LG · May 31, 2022

k-Means Maximum Entropy Exploration

arXiv:2205.15623v412.416 citations · h-index: 3

Originality: Incremental advance

AI Analysis

This paper addresses the exploration problem for reinforcement learning agents in complex environments. It is an incremental improvement over existing maximum entropy methods.

The paper tackles exploration in high-dimensional, continuous spaces with sparse rewards. It introduces an artificial curiosity algorithm that lower-bounds an approximation to the entropy of the state visitation distribution. The authors report that the method is computationally efficient and competitive on exploration benchmarks, especially on tasks where reinforcement learning algorithms are unable to find rewards.

Exploration in high-dimensional, continuous spaces with sparse rewards is an open problem in reinforcement learning. Artificial curiosity algorithms address this by creating rewards that lead to exploration. Given a reinforcement learning algorithm capable of maximizing rewards, the problem reduces to finding an optimization objective consistent with exploration. Maximum entropy exploration uses the entropy of the state visitation distribution as such an objective. However, efficiently estimating the entropy of the state visitation distribution is challenging in high-dimensional, continuous spaces. We introduce an artificial curiosity algorithm based on lower bounding an approximation to the entropy of the state visitation distribution. The bound relies on a result we prove for non-parametric density estimation in arbitrary dimensions using k-means. We show that our approach is both computationally efficient and competitive on benchmarks for exploration in high-dimensional, continuous spaces, especially on tasks where reinforcement learning algorithms are unable to find rewards.
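
To make the idea of a k-means-based curiosity reward concrete, the following is a minimal sketch under stated assumptions, not the paper's algorithm or its proven bound. It assumes a fixed set of cluster centers, a single online k-means update per visited state, and an entropy proxy equal to the sum of log nearest-neighbor distances between centers. The function names (`entropy_proxy`, `curiosity_step`) and the parameters `lr` and `eps` are hypothetical and chosen only for illustration.

```python
import numpy as np


def entropy_proxy(centers, eps=1e-8):
    """Illustrative proxy: sum over centers of log distance to the nearest other center.

    Assumption: spread-out centers stand in for a high-entropy state
    visitation distribution. This is not the bound derived in the paper.
    """
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # ignore each center's distance to itself
    return float(np.sum(np.log(dist.min(axis=1) + eps)))


def curiosity_step(centers, state, lr=0.05):
    """Move the nearest center toward `state`; reward is the change in the proxy."""
    before = entropy_proxy(centers)
    updated = centers.copy()
    nearest = int(np.argmin(np.linalg.norm(centers - state, axis=1)))
    updated[nearest] += lr * (state - updated[nearest])  # online k-means update
    reward = entropy_proxy(updated) - before
    return updated, reward


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = rng.normal(size=(8, 3))  # 8 centers in a 3-dimensional state space
    state = rng.normal(size=3)         # a newly visited state
    centers, reward = curiosity_step(centers, state)
    print(reward)
```

In this sketch the intrinsic reward is positive when a visited state pulls its nearest center away from the others and negative when it pulls centers together. Each step here costs O(k^2) in the number of centers because the proxy is recomputed in full.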
