LG · RO · ML · Mar 3, 2020

Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?

arXiv:2003.01629v2 · 63 citations
AI Analysis

This paper addresses the sample inefficiency that limits deep RL in real-world applications. The contribution is incremental: it keeps existing model-free methods and adds a novel feature extraction approach on top of them.

The paper tackles sample inefficiency in deep reinforcement learning by asking whether increasing input dimensionality can improve performance and sample efficiency. It shows that feeding the agent a high-dimensional representation from an online feature extractor network (OFENet) leads to more efficient learning and, in the reported experiments, outperforms several state-of-the-art algorithms.

Deep reinforcement learning (RL) algorithms have recently achieved remarkable successes in various sequential decision making tasks, leveraging advances in methods for training large deep networks. However, these methods usually require large amounts of training data, which is often a big problem for real-world applications. One natural question to ask is whether learning good representations for states and using larger networks helps in learning better policies. In this paper, we try to study if increasing input dimensionality helps improve performance and sample efficiency of model-free deep RL algorithms. To do so, we propose an online feature extractor network (OFENet) that uses neural nets to produce good representations to be used as inputs to deep RL algorithms. Even though the high dimensionality of input is usually supposed to make learning of RL agents more difficult, we show that the RL agents in fact learn more efficiently with the high-dimensional representation than with the lower-dimensional state observations. We believe that stronger feature propagation together with larger networks (and thus larger search space) allows RL agents to learn more complex functions of states and thus improves the sample efficiency. Through numerical experiments, we show that the proposed method outperforms several other state-of-the-art algorithms in terms of both sample efficiency and performance. Codes for the proposed method are available at http://www.merl.com/research/license/OFENet .
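
To make the idea of a representation that is larger than the raw observation concrete, here is a minimal sketch. The abstract does not specify the OFENet architecture, so everything below is an assumption for illustration: each block concatenates its output with its input (one plausible reading of "stronger feature propagation"), and the names `build_extractor`, `dense_block`, and `extract_features`, along with the layer sizes, are hypothetical. The weights are random and untrained; how the extractor is trained online is not covered here.

```python
import math
import random


def make_layer(in_dim, out_dim, rng):
    """Return (weights, biases) for one fully connected layer."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    biases = [0.0] * out_dim
    return weights, biases


def dense_block(x, layer):
    """Apply one ReLU layer, then concatenate its output with its input."""
    weights, biases = layer
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(weights, biases)]
    # Assumption: concatenation keeps the earlier features available downstream.
    return x + hidden


def build_extractor(state_dim, units_per_block, num_blocks, seed=0):
    """Stack blocks whose input size grows by units_per_block each time."""
    rng = random.Random(seed)
    layers = []
    in_dim = state_dim
    for _ in range(num_blocks):
        layers.append(make_layer(in_dim, units_per_block, rng))
        in_dim += units_per_block
    return layers


def extract_features(state, layers):
    """Map a raw state to a higher-dimensional feature vector."""
    features = list(state)
    for layer in layers:
        features = dense_block(features, layer)
    return features


state = [0.1, -0.4, 0.7]  # hypothetical 3-dimensional observation
layers = build_extractor(state_dim=3, units_per_block=8, num_blocks=4)
features = extract_features(state, layers)
print(len(state), "->", len(features))  # 3 -> 35
```

In this sketch the 35-dimensional `features` vector, rather than the 3-dimensional `state`, would be passed to the policy and value networks of a model-free algorithm. The raw state survives unchanged in the first three entries, so the agent loses no information relative to the original observation.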

Code Implementations: 1 repo
