Policy Search in Continuous Action Domains: an Overview
This is an incremental overview paper that synthesizes existing work for researchers in reinforcement learning and policy search.
The paper provides a unified survey of policy search methods in continuous action domains, covering deep reinforcement learning, evolutionary algorithms, Bayesian optimization, and directed exploration, and outlines factors affecting sample efficiency.
Continuous action policy search is currently the focus of intensive research, driven by both the recent success of deep reinforcement learning algorithms and the emergence of competitors based on evolutionary algorithms. In this paper, we present a broad survey of policy search methods, providing a unified perspective on very different approaches, including Bayesian optimization and directed exploration methods. The main message of this overview concerns the relationships between these families of methods, but we also outline some of the factors underlying the sample efficiency of the various approaches.