LG · AI · RO · Jun 17, 2023

Vanishing Bias Heuristic-guided Reinforcement Learning Algorithm

arXiv:2306.10216v1 · 2 citations · h-index: 3
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement for reinforcement learning practitioners working on lunar lander tasks.

The paper addresses reinforcement learning in the lunar lander environment. It proposes a new algorithm, Heuristic RL, which uses heuristics to guide early training while reducing the human bias those heuristics introduce. The authors report promising experimental results.

Reinforcement learning has achieved tremendous success in many Atari games. In this paper we explore the lunar lander environment and implement classical methods including Q-Learning, SARSA, and Monte Carlo (MC), as well as tile coding. We also implement neural-network-based methods including DQN, Double DQN, and Clipped DQN. On top of these, we propose a new algorithm called Heuristic RL, which uses heuristics to guide early-stage training while alleviating the human bias they introduce. Our experiments show promising results for the proposed methods in the lunar lander environment.
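The abstract does not say how the heuristic enters training or how its influence is reduced, so the following is only a minimal sketch of one plausible reading of "vanishing bias". It assumes the heuristic gives a per-action score that is added to the learned Q-values with a weight that decays exponentially over episodes. The names `heuristic_weight` and `select_action`, the decay schedule, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math
import random


def heuristic_weight(episode, initial=1.0, decay=0.01):
    """Assumed schedule: the heuristic's weight decays exponentially,
    so the bias it introduces vanishes as training proceeds."""
    return initial * math.exp(-decay * episode)


def select_action(q_values, heuristic_scores, episode, epsilon=0.1, rng=random):
    """Epsilon-greedy selection over Q-values plus a decaying heuristic bonus.

    q_values and heuristic_scores are equal-length lists, one entry per action.
    """
    if rng.random() < epsilon:
        # Explore: pick a uniformly random action.
        return rng.randrange(len(q_values))
    w = heuristic_weight(episode)
    combined = [q + w * h for q, h in zip(q_values, heuristic_scores)]
    # Exploit: pick the action with the highest combined score.
    return max(range(len(combined)), key=combined.__getitem__)


# Toy example with four discrete actions (hypothetical numbers).
q = [0.2, 0.5, 0.1, 0.0]   # learned value estimates
h = [0.0, 0.0, 1.0, 0.0]   # the heuristic prefers action 2

early = select_action(q, h, episode=0, epsilon=0.0)
late = select_action(q, h, episode=1000, epsilon=0.0)
```

With exploration switched off, the heuristic dominates the choice at episode 0, while by episode 1000 its weight is close to zero and the choice follows the learned Q-values alone. Other designs, such as decaying a shaped reward term instead of an action bonus, would fit the abstract equally well.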

