LG, RO | Dec 21, 2020

Explicitly Encouraging Low Fractional Dimensional Trajectories Via Reinforcement Learning

arXiv:2012.11662v2 | 4 citations
AI Analysis

This work provides an incremental step towards improving the robustness of reinforcement learning control policies for engineers and researchers by reducing the effective dimensionality of system dynamics.

This paper addresses the curse of dimensionality in machine learning-based feedback control by influencing the dimensionality of trajectories induced by model-free reinforcement learning agents. By adding a post-processing function to the reward signal, the authors show that the dimensionality of system trajectories can be reduced, leading to increased robustness against noise and push disturbances.

A key limitation in using various modern methods of machine learning to develop feedback control policies is the lack of appropriate methodologies to analyze their long-term dynamics, in terms of making any sort of guarantees (even statistical ones) about robustness. This is largely due to the so-called curse of dimensionality, combined with the black-box nature of the resulting control policies themselves. This paper addresses the first of these issues. Although the full state space of a system may be quite large in dimensionality, it is a common feature of most model-based control methods that the resulting closed-loop systems demonstrate dominant dynamics that are rapidly driven to some lower-dimensional subspace within it. In this work we argue that the dimensionality of this subspace is captured by tools from fractal geometry, namely various notions of a fractional dimension. We then show that the dimensionality of trajectories induced by model-free reinforcement learning agents can be influenced by adding a post-processing function to the agent's reward signal. We verify that the dimensionality reduction is robust to noise being added to the system, and show that the modified agents are in fact more robust to noise and push disturbances in general for the systems we examined.
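
To make the idea of a reward post-processor concrete, the following is a minimal Python sketch, not the authors' formulation. It assumes a box-counting estimate as the notion of fractional dimension (the abstract only says "various notions"), and the function names `box_counting_dimension` and `postprocess_return`, the choice of box sizes, and the division of the return by the estimate are all hypothetical choices made for illustration.

```python
import numpy as np

def box_count(points, box_size):
    """Count the distinct grid boxes of side `box_size` that contain a point."""
    idx = np.floor(points / box_size).astype(np.int64)
    return len(np.unique(idx, axis=0))

def box_counting_dimension(points, box_sizes):
    """Estimate dimension as the slope of log N(eps) versus log(1/eps).

    `points` is a (T, n) array of visited states; `box_sizes` is a
    sequence of grid resolutions eps (an assumed, hand-picked range).
    """
    box_sizes = np.asarray(box_sizes, dtype=float)
    counts = np.array([box_count(points, eps) for eps in box_sizes])
    slope, _ = np.polyfit(np.log(1.0 / box_sizes), np.log(counts), 1)
    return float(slope)

def postprocess_return(episode_return, states, box_sizes):
    """Hypothetical post-processor: shrink a positive return as dimension grows.

    Dividing by the estimate is an assumption for illustration and only
    makes sense when returns are positive.
    """
    dim = max(box_counting_dimension(states, box_sizes), 1e-3)
    return episode_return / dim
```

Under these assumptions, a trajectory confined to a line in state space receives a larger post-processed return than one that fills a plane, given the same raw return. An agent trained on the modified signal would therefore be nudged towards lower-dimensional trajectories.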

Code Implementations: 1 repo
