LG AI Jan 26, 2023

Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning

Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

arXiv:2301.10886v513.718 citations h-index: 58 Has Code

Originality: Incremental advance

AI Analysis

This work addresses exploration challenges in reinforcement learning for AI agents, but it is incremental as it builds on existing intrinsic reward methods with adaptive selection.

The paper tackles the problem of exploration in deep reinforcement learning by introducing AIRS, an automatic intrinsic reward shaping method that adaptively selects shaping functions based on estimated task return, resulting in superior performance on tasks from MiniGrid, Procgen, and DeepMind Control Suite compared to benchmarking schemes.

We present AIRS: Automatic Intrinsic Reward Shaping, which intelligently and adaptively provides high-quality intrinsic rewards to enhance exploration in reinforcement learning (RL). More specifically, AIRS selects a shaping function from a predefined set based on the estimated task return in real time, providing reliable exploration incentives and alleviating the biased objective problem. Moreover, we develop an intrinsic reward toolkit to provide efficient and reliable implementations of diverse intrinsic reward approaches. We test AIRS on various tasks from MiniGrid, Procgen, and DeepMind Control Suite. Extensive simulations demonstrate that AIRS can outperform the benchmarking schemes and achieve superior performance with a simple architecture.
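The selection step described in the abstract, choosing a shaping function from a predefined set according to estimated task return, can be illustrated with a small sketch. The abstract does not specify the selection rule, so the upper-confidence-bound (UCB) score used here is an assumption, and the class name `ShapingSelector`, the `exploration` parameter, and the candidate names are all hypothetical. It is not the authors' implementation or the API of their toolkit.

```python
import math


class ShapingSelector:
    """Hypothetical UCB-style chooser over a fixed set of shaping functions.

    Assumption: each candidate is scored by its running mean task return
    plus an exploration bonus. The source only says selection is based on
    estimated task return; the UCB rule is an illustrative stand-in.
    """

    def __init__(self, names, exploration=1.0):
        self.names = list(names)
        self.exploration = exploration
        self.counts = {n: 0 for n in self.names}
        self.mean_return = {n: 0.0 for n in self.names}
        self.total = 0

    def select(self):
        # Try every candidate once before trusting any estimate.
        for n in self.names:
            if self.counts[n] == 0:
                return n

        def score(n):
            bonus = self.exploration * math.sqrt(
                math.log(self.total) / self.counts[n]
            )
            return self.mean_return[n] + bonus

        return max(self.names, key=score)

    def update(self, name, task_return):
        # Incremental mean of the task (extrinsic) return seen under `name`.
        self.counts[name] += 1
        self.total += 1
        delta = task_return - self.mean_return[name]
        self.mean_return[name] += delta / self.counts[name]


if __name__ == "__main__":
    # Made-up returns, purely to exercise the selector.
    fake_returns = {"none": 0.1, "count": 0.5, "curiosity": 0.3}
    selector = ShapingSelector(fake_returns, exploration=0.0)
    for _ in range(10):
        choice = selector.select()
        selector.update(choice, fake_returns[choice])
    print(selector.select(), selector.counts)
```

In a training loop, `select()` would be called before each policy update to decide which intrinsic reward to add, and `update()` afterwards with the task return observed under that choice. Scoring on task return rather than the shaped return keeps the choice tied to the original objective.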

