LG ML | Feb 21, 2020

On the Search for Feedback in Reinforcement Learning

Ran Wang, Karthikeya S. Parunandi, Aayushman Sharma, Raman Goyal, Suman Chakravorty

arXiv:2002.09478v63.319 citations | h-index: 25

Originality: Incremental advance

AI Analysis

This work addresses efficiency and reliability issues in RL for dynamical systems, offering a novel method that could benefit applications in robotics and control, though it appears incremental as it builds on existing feedback law concepts.

The authors tackled the problem of high training times and variance in Reinforcement Learning (RL) for unknown nonlinear dynamical systems by proposing a local feedback representation instead of complex global parametrizations. They demonstrated that this approach results in highly efficient training, repeatable and reliable solutions, and superior closed-loop performance compared to state-of-the-art RL techniques, with the ability to recover global optimality through replanning.

The problem of Reinforcement Learning (RL) in an unknown nonlinear dynamical system is equivalent to the search for an optimal feedback law using simulations (rollouts) of the dynamical system. Most RL techniques search over a complex global nonlinear feedback parametrization, which makes them suffer from long training times and high variance. Instead, we advocate searching over a local feedback representation consisting of an open-loop sequence and an associated optimal linear feedback law completely determined by the open-loop sequence. We show that this alternate approach results in highly efficient training, that the answers obtained are repeatable and hence reliable, and that the resulting closed-loop performance is superior to global state-of-the-art RL techniques. Finally, replanning whenever required, which is feasible because the local solution is fast and reliable, allows us to recover global optimality of the resulting feedback law.
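The local feedback representation described in the abstract can be illustrated with a minimal sketch: a nominal open-loop control sequence plus a time-varying linear feedback on the deviation from the nominal trajectory. This is not the authors' algorithm. It assumes a known, linear double-integrator model (the paper targets unknown nonlinear systems identified from rollouts), and it computes the gains with a standard finite-horizon LQR Riccati recursion. All names (`tv_lqr_gains`, `rollout`, the weights `Q`, `R`, `Qf`) are hypothetical choices made for illustration.

```python
import numpy as np

def tv_lqr_gains(A_seq, B_seq, Q, R, Qf):
    """Backward Riccati recursion; returns K_t such that du_t = -K_t @ dx_t."""
    P = Qf
    gains = []
    for A, B in zip(reversed(A_seq), reversed(B_seq)):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]

def rollout(f, x0, u_nom, x_nom=None, gains=None):
    """Simulate f; if gains are given, add linear feedback around x_nom."""
    x = x0
    xs = [x0]
    for t, u in enumerate(u_nom):
        if gains is not None:
            u = u - gains[t] @ (x - x_nom[t])
        x = f(x, u)
        xs.append(x)
    return np.array(xs)

# Assumed model: discrete-time double integrator (position, velocity).
dt, T = 0.1, 50
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
f = lambda x, u: A @ x + B @ u

# Open-loop part: an arbitrary nominal control sequence and its trajectory.
u_nom = [np.array([np.sin(0.2 * t)]) for t in range(T)]
x_nom = rollout(f, np.zeros(2), u_nom)

# Feedback part: gains determined by the (here constant) linearization
# along the nominal trajectory.
gains = tv_lqr_gains([A] * T, [B] * T,
                     Q=np.eye(2), R=0.1 * np.eye(1), Qf=10.0 * np.eye(2))

# Start from a perturbed initial state and compare final deviations.
x0_pert = np.array([0.1, 0.1])
open_dev = np.linalg.norm(rollout(f, x0_pert, u_nom)[-1] - x_nom[-1])
closed_dev = np.linalg.norm(
    rollout(f, x0_pert, u_nom, x_nom, gains)[-1] - x_nom[-1])
print(f"open-loop deviation: {open_dev:.3f}, with feedback: {closed_dev:.3f}")
```

In this sketch the only decision variables are the entries of `u_nom`; the gains follow from the nominal trajectory and the cost weights, which is one way to read "completely determined by the open-loop". For a nonlinear system, `A_seq` and `B_seq` would instead vary with time and would have to be estimated along the nominal trajectory.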
