Balancing a CartPole System with Reinforcement Learning -- A Tutorial
This is an incremental tutorial for beginners in reinforcement learning, demonstrating practical implementations on a standard benchmark.
The paper implements and compares various reinforcement learning algorithms on the CartPole problem, finding that DQN with prioritized experience replay achieves the best performance by solving the task within 150 episodes.
In this paper, we provide the details of implementing various reinforcement learning (RL) algorithms for controlling a Cart-Pole system. In particular, we describe RL concepts such as Q-learning, Deep Q-Networks (DQN), Double DQN, dueling networks, and (prioritized) experience replay, and show their effect on learning performance. In the process, readers are introduced to the OpenAI Gym and Keras utilities used to implement these concepts. We observe that DQN with prioritized experience replay (PER) performs best among the architectures considered, solving the problem within 150 episodes.
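
To make the listed concepts concrete, the following is a minimal sketch in plain Python of how the DQN and Double DQN bootstrap targets differ, and of how proportional prioritized replay turns TD errors into sampling probabilities. It is not the paper's Keras implementation: the function names, the discount factor `gamma`, and the PER constants `alpha` and `eps` are assumptions chosen for illustration.

```python
from typing import List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def dqn_target(reward: float, next_q_target: Sequence[float],
               done: bool, gamma: float = 0.99) -> float:
    """DQN target: the target network both selects and evaluates the next action."""
    if done:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      done: bool, gamma: float = 0.99) -> float:
    """Double DQN target: the online network selects the action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = argmax(next_q_online)
    return reward + gamma * next_q_target[best_action]


def per_probabilities(td_errors: Sequence[float],
                      alpha: float = 0.6, eps: float = 1e-6) -> List[float]:
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha,
    with p_i = |td_error_i| + eps (alpha and eps are assumed values)."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]
```

With two actions, the two targets differ whenever the online and target networks disagree on the best next action; transitions with larger absolute TD error receive a larger share of the replay sampling probability.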