OCLGSYDSNov 24, 2020

Safely Learning Dynamical Systems from Short Trajectories

arXiv:2011.12257v15 citations
Originality Highly original
AI Analysis

This work provides a formal definition and algorithmic approaches for safely learning dynamical systems, which is crucial for applications where exploration must not lead to unsafe states, particularly for control engineers working with unknown systems.

This paper addresses the challenge of safely learning dynamical systems by defining a framework for sequential trajectory initialization that ensures the system state remains within a safety region under all consistent dynamics. For linear systems, they propose an LP-based algorithm for learning from length-one trajectories or certifying impossibility, and provide an SDP representation for safe length-two trajectories. For nonlinear systems, they offer an SOCP representation for safe single-step trajectories.

A fundamental challenge in learning to control an unknown dynamical system is to reduce model uncertainty by making measurements while maintaining safety. In this work, we formulate a mathematical definition of what it means to safely learn a dynamical system by sequentially deciding where to initialize the next trajectory. In our framework, the state of the system is required to stay within a given safety region under the (possibly repeated) action of all dynamical systems that are consistent with the information gathered so far. For our first two results, we consider the setting of safely learning linear dynamics. We present a linear programming-based algorithm that either safely recovers the true dynamics from trajectories of length one, or certifies that safe learning is impossible. We also give an efficient semidefinite representation of the set of initial conditions whose resulting trajectories of length two are guaranteed to stay in the safety region. For our final result, we study the problem of safely learning a nonlinear dynamical system. We give a second-order cone programming based representation of the set of initial conditions that are guaranteed to remain in the safety region after one application of the system dynamics.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes