LG | AI | Dec 21, 2022

Reward Bonuses with Gain Scheduling Inspired by Iterative Deepening Search

arXiv:2212.10765v1 | 6 citations | h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses efficient exploration in reinforcement learning. The contribution is incremental, since it builds on existing intrinsic-bonus methods.

The paper tackles inefficient search in reinforcement learning by introducing intrinsic reward bonuses analogous to depth-first and breadth-first search, combined with gain scheduling inspired by iterative deepening search. Across six tasks (three locomotion tasks with dense rewards and three simple tasks with sparse rewards), the two bonuses help on different tasks, and combining them with the gain scheduling yields high performance on all six.
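To make the idea of gain-scheduled bonuses concrete, here is a minimal sketch of how two intrinsic bonuses might be mixed into a task reward. Everything in it is an assumption for illustration: the names `BonusGains`, `scheduled_gains`, and `shaped_reward`, the `max_gain` default, and in particular the linear schedule, which is a stand-in and not the heuristic schedule designed in the paper. The bonus values themselves are taken as given inputs, since the summary does not specify how they are computed.

```python
from dataclasses import dataclass


@dataclass
class BonusGains:
    """Weights for the two intrinsic bonuses (hypothetical names)."""
    depth: float    # gain on the depth-first-like bonus
    breadth: float  # gain on the breadth-first-like bonus


def scheduled_gains(step: int, total_steps: int, max_gain: float = 0.1) -> BonusGains:
    """Assumed linear schedule, NOT the paper's: start breadth-heavy,
    then shift weight toward the depth-oriented bonus as training proceeds."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp training progress to the interval [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)
    return BonusGains(depth=max_gain * progress,
                      breadth=max_gain * (1.0 - progress))


def shaped_reward(task_reward: float, depth_bonus: float,
                  breadth_bonus: float, gains: BonusGains) -> float:
    """Task reward plus the gain-weighted sum of the two intrinsic bonuses."""
    return task_reward + gains.depth * depth_bonus + gains.breadth * breadth_bonus
```

In a training loop, one would call `scheduled_gains(step, total_steps)` once per update and pass the result to `shaped_reward` for each transition. Keeping the schedule separate from the bonus definitions makes it easy to swap in a different schedule.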

This paper introduces a novel method of adding intrinsic bonuses to a task-oriented reward function in order to facilitate efficient search in reinforcement learning. While various bonuses have been designed to date, they are analogous to the depth-first and breadth-first search algorithms in graph theory. This paper therefore first designs two bonuses, one for each of them. Then, a heuristic gain scheduling is applied to the designed bonuses, inspired by iterative deepening search, which is known to inherit the advantages of the two search algorithms. The proposed method is expected to allow the agent to efficiently reach the best solution in deeper states by gradually exploring unknown states. In three locomotion tasks with dense rewards and three simple tasks with sparse rewards, it is shown that the two types of bonuses contribute complementarily to performance improvements on different tasks. In addition, by combining them with the proposed gain scheduling, all tasks can be accomplished with high performance.
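For readers unfamiliar with the search algorithm the abstract refers to, the following is a minimal sketch of classical iterative deepening search on a graph: a depth-limited depth-first search is repeated with an increasing depth limit, so shallow solutions are found first (as in breadth-first search) while memory use stays close to that of depth-first search. This illustrates the graph-search idea only, not the paper's reinforcement learning method. The function names and the adjacency-list graph representation are assumptions for the example.

```python
from typing import Dict, Hashable, List, Optional

Graph = Dict[Hashable, List[Hashable]]


def depth_limited_search(graph: Graph, node: Hashable, goal: Hashable,
                         limit: int, path: List[Hashable]) -> Optional[List[Hashable]]:
    """Depth-first search that expands at most `limit` edges below `node`."""
    if node == goal:
        return path
    if limit == 0:
        return None
    for neighbor in graph.get(node, []):
        if neighbor in path:  # skip nodes already on the current path (cycle guard)
            continue
        found = depth_limited_search(graph, neighbor, goal,
                                     limit - 1, path + [neighbor])
        if found is not None:
            return found
    return None


def iterative_deepening_search(graph: Graph, start: Hashable, goal: Hashable,
                               max_depth: int) -> Optional[List[Hashable]]:
    """Run depth-limited search with limits 0, 1, ..., max_depth in turn."""
    for limit in range(max_depth + 1):
        found = depth_limited_search(graph, start, goal, limit, [start])
        if found is not None:
            return found
    return None
```

On a graph where a long branch is listed before a short one, plain depth-first search would return the long path, whereas this sketch returns a path with the fewest edges, because every shallower limit is exhausted before a deeper one is tried.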
