Implementing the Deep Q-Network
This work addresses reproducibility problems faced by researchers in deep reinforcement learning. Its contribution is incremental, since it reproduces and builds on an existing method rather than proposing a new one.
The paper tackles the challenge of replicating Deep Q-Network results by identifying and detailing key implementation parameters and software engineering solutions, and it provides its own implementation, designed to work across multiple domains beyond the original Arcade Learning Environment.
The Deep Q-Network (DQN) proposed by Mnih et al. [2015] has become a benchmark and starting point for much deep reinforcement learning research. However, replicating results for complex systems is often challenging, since original scientific publications cannot always describe every important parameter setting and software engineering solution in detail. In this paper, we present our work reproducing the results of the DQN paper. To make it easier for researchers to replicate these results, we highlight key areas of the implementation that were not covered in great detail in the original paper, including termination conditions and gradient descent algorithms. Finally, we discuss methods for improving computational performance and provide our own implementation, which is designed to work with a range of domains and not just the original Arcade Learning Environment [Bellemare et al., 2013].
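To illustrate why termination conditions matter for replication, the following minimal sketch computes one-step Q-learning targets of the kind DQN trains against, where a terminal transition uses the reward alone and does not bootstrap from the next state. The function name `dqn_targets`, its parameters, and the example values are hypothetical and are not taken from the paper's implementation, whose handling of termination may differ in detail.

```python
from typing import List, Sequence


def dqn_targets(
    rewards: Sequence[float],
    next_q_values: Sequence[Sequence[float]],
    terminals: Sequence[bool],
    gamma: float = 0.99,
) -> List[float]:
    """Compute one-step Q-learning targets for a batch of transitions.

    rewards[i]       -- reward observed on transition i
    next_q_values[i] -- target-network Q-values for every action in the
                        next state of transition i (hypothetical input)
    terminals[i]     -- True if transition i ended the episode
    gamma            -- discount factor (0.99 is an assumed default)
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, terminals):
        if done:
            # Terminal transition: no future return, so the target is the reward.
            targets.append(float(reward))
        else:
            # Non-terminal transition: bootstrap from the best next action.
            targets.append(float(reward) + gamma * max(q_next))
    return targets


# Example: the first transition continues, the second ends the episode.
example = dqn_targets(
    rewards=[1.0, 0.0],
    next_q_values=[[0.5, 2.0], [3.0, 1.0]],
    terminals=[False, True],
    gamma=0.5,
)
print(example)  # [2.0, 0.0]
```

If the terminal flag were ignored in this example, the second target would become 1.5 instead of 0.0, because it would bootstrap from Q-values of a state that lies beyond the end of the episode. What counts as "terminal" (for instance, end of game versus loss of a life) is exactly the kind of detail an implementation has to pin down.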