LG · AI · Jan 26, 2023

Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning

arXiv:2301.10886v5 · 18 citations · h-index: 58
Originality: Incremental advance
AI Analysis

This work addresses exploration challenges in reinforcement learning for AI agents, but it is incremental as it builds on existing intrinsic reward methods with adaptive selection.

The paper tackles exploration in deep reinforcement learning by introducing AIRS, an automatic intrinsic reward shaping method that adaptively selects shaping functions based on the estimated task return. The authors report that it outperforms the benchmarking schemes on tasks from MiniGrid, Procgen, and DeepMind Control Suite.

We present AIRS: Automatic Intrinsic Reward Shaping that intelligently and adaptively provides high-quality intrinsic rewards to enhance exploration in reinforcement learning (RL). More specifically, AIRS selects a shaping function from a predefined set based on the estimated task return in real time, providing reliable exploration incentives and alleviating the biased objective problem. Moreover, we develop an intrinsic reward toolkit to provide efficient and reliable implementations of diverse intrinsic reward approaches. We test AIRS on various tasks of MiniGrid, Procgen, and DeepMind Control Suite. Extensive simulations demonstrate that AIRS can outperform the benchmarking schemes and achieve superior performance with a simple architecture.
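The selection step described in the abstract (pick a shaping function from a predefined set based on the estimated task return) can be illustrated with a minimal sketch. The abstract does not state the selection rule, so the upper-confidence-bound (UCB) bandit used here is an assumption, as are all names (`ShapingSelector`, `SHAPING_FUNCTIONS`, `beta`, `window`) and the two example shaping functions. It is not the authors' implementation.

```python
import math
from typing import Callable, Dict, List

# Hypothetical shaping set: each function maps (extrinsic, intrinsic) rewards
# to the reward the agent is trained on. Names and forms are illustrative only.
SHAPING_FUNCTIONS: Dict[str, Callable[[float, float], float]] = {
    "extrinsic_only": lambda r_ext, r_int: r_ext,
    "additive": lambda r_ext, r_int, beta=0.1: r_ext + beta * r_int,
}


class ShapingSelector:
    """Assumed UCB-style bandit over a fixed set of shaping functions."""

    def __init__(self, names: List[str], c: float = 1.0, window: int = 10):
        self.names = list(names)
        self.c = c            # exploration coefficient
        self.window = window  # number of recent returns averaged per arm
        self.counts = {n: 0 for n in self.names}
        self.returns: Dict[str, List[float]] = {n: [] for n in self.names}
        self.t = 0

    def value(self, name: str) -> float:
        """Mean of the most recent estimated task returns for one arm."""
        recent = self.returns[name][-self.window:]
        return sum(recent) / len(recent) if recent else 0.0

    def select(self) -> str:
        """Return the name of the shaping function to use for the next update."""
        self.t += 1
        # Try every shaping function once before comparing UCB scores.
        for n in self.names:
            if self.counts[n] == 0:
                return n

        def ucb(n: str) -> float:
            bonus = self.c * math.sqrt(math.log(self.t) / self.counts[n])
            return self.value(n) + bonus

        return max(self.names, key=ucb)

    def update(self, name: str, estimated_return: float) -> None:
        """Record the estimated task return observed after using `name`."""
        self.counts[name] += 1
        self.returns[name].append(estimated_return)


if __name__ == "__main__":
    selector = ShapingSelector(list(SHAPING_FUNCTIONS))
    choice = selector.select()
    shaped = SHAPING_FUNCTIONS[choice](1.0, 0.5)  # shape one reward sample
    selector.update(choice, estimated_return=shaped)
    print(choice, shaped)
```

In a training loop, `select()` would be called once per policy update, the chosen function applied to the collected rewards, and `update()` called with whatever estimate of extrinsic task return the agent maintains. Scoring arms on task return alone, rather than on the shaped reward, is what keeps the selection tied to the original objective.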

Code Implementations: 1 repo
