SYLGDec 14, 2020

Non-linear State-space Model Identification from Video Data using Deep Encoders

arXiv:2012.07721v310 citations
AI Analysis

This work is significant for robotics, autonomous vehicles, and medical imaging, where systems are often measured by video streams, by enabling better system identification and prediction.

The paper addresses the challenge of identifying systems with high-dimensional video inputs and outputs by proposing a novel non-linear state-space identification method. It achieves low simulation error and excellent long-term prediction capability on a simulated environment of a controllable ball in a unit box.

Identifying systems with high-dimensional inputs and outputs, such as systems measured by video streams, is a challenging problem with numerous applications in robotics, autonomous vehicles and medical imaging. In this paper, we propose a novel non-linear state-space identification method starting from high-dimensional input and output data. Multiple computational and conceptual advances are combined to handle the high-dimensional nature of the data. An encoder function, represented by a neural network, is introduced to learn a reconstructability map to estimate the model states from past inputs and outputs. This encoder function is jointly learned with the dynamics. Furthermore, multiple computational improvements, such as an improved reformulation of multiple shooting and batch optimization, are proposed to keep the computational time under control when dealing with high-dimensional and large datasets. We apply the proposed method to a video stream of a simulated environment of a controllable ball in a unit box. The study shows low simulation error with excellent long term prediction capability of the model obtained using the proposed method.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes