LG SY DATA-AN | Feb 4, 2023

Benchmarking sparse system identification with low-dimensional chaos

Alan A. Kaptanoglu, Lanyue Zhang, Zachary G. Nicolaou, Urban Fasel, Steven L. Brunton

arXiv:2302.10787v1 | 17.048 citations | h-index: 78 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work provides a large-scale comparison for researchers in dynamical systems and sparse regression. It is incremental, however, because it benchmarks existing methods rather than introducing new ones.

The authors benchmarked sparse system identification methods using a standardized database of chaotic systems, finding that the original SINDy algorithm and a recent mixed-integer discrete variant performed strongly, with the weak SINDy formulation showing significant improvements over traditional methods even on clean data.

Sparse system identification is the data-driven process of obtaining parsimonious differential equations that describe the evolution of a dynamical system, balancing model complexity and accuracy. There has been rapid innovation in system identification across scientific domains, but there remains a gap in the literature for large-scale methodological comparisons that are evaluated on a variety of dynamical systems. In this work, we systematically benchmark sparse regression variants by utilizing the dysts standardized database of chaotic systems. In particular, we demonstrate how this open-source tool can be used to quantitatively compare different methods of system identification. To illustrate how this benchmark can be utilized, we perform a large comparison of four algorithms for solving the sparse identification of nonlinear dynamics (SINDy) optimization problem, finding strong performance of the original algorithm and a recent mixed-integer discrete algorithm. In all cases, we used ensembling to improve the noise robustness of SINDy and provide statistical comparisons. In addition, we show very compelling evidence that the weak SINDy formulation provides significant improvements over the traditional method, even on clean data. Lastly, we investigate how Pareto-optimal models generated from SINDy algorithms depend on the properties of the equations, finding that the performance shows no significant dependence on a set of dynamical properties that quantify the amount of chaos, scale separation, degree of nonlinearity, and the syntactic complexity.
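To make the SINDy optimization problem in the abstract concrete, the following is a minimal sketch of sequentially thresholded least squares, the sparse regression used by the original SINDy algorithm. It is not the authors' benchmark code and does not use the dysts database. The Lorenz system, the quadratic polynomial library, the threshold of 0.5, and all function names are assumptions chosen for illustration. To keep the example deterministic, derivatives are evaluated from the known right-hand side on clean data, whereas in practice they would be estimated from measurements.

```python
import numpy as np


def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system (illustrative test case)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])


def simulate(rhs, x0, dt, n_steps):
    """Integrate an ODE with the classical fourth-order Runge-Kutta scheme."""
    traj = np.empty((n_steps, len(x0)))
    traj[0] = x0
    for k in range(n_steps - 1):
        s = traj[k]
        k1 = rhs(s)
        k2 = rhs(s + 0.5 * dt * k1)
        k3 = rhs(s + 0.5 * dt * k2)
        k4 = rhs(s + dt * k3)
        traj[k + 1] = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj


def quadratic_library(X):
    """Candidate terms: all monomials in (x, y, z) up to degree two."""
    x, y, z = X.T
    names = ["1", "x", "y", "z", "x^2", "xy", "xz", "y^2", "yz", "z^2"]
    Theta = np.column_stack(
        [np.ones_like(x), x, y, z, x * x, x * y, x * z, y * y, y * z, z * z]
    )
    return Theta, names


def stlsq(Theta, dXdt, threshold=0.5, max_iter=10):
    """Sequentially thresholded least squares.

    Alternates between zeroing coefficients below `threshold` and
    refitting the surviving terms by ordinary least squares.
    """
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(max_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for j in range(dXdt.shape[1]):
            keep = ~small[:, j]
            if keep.any():
                Xi[keep, j] = np.linalg.lstsq(
                    Theta[:, keep], dXdt[:, j], rcond=None
                )[0]
    return Xi


# Generate clean training data and the (exact) derivatives at each sample.
X = simulate(lorenz, np.array([1.0, 1.0, 1.0]), dt=0.002, n_steps=5000)
dXdt = np.array([lorenz(s) for s in X])

Theta, names = quadratic_library(X)
Xi = stlsq(Theta, dXdt)

# Print the identified sparse model, one equation per state variable.
for j, var in enumerate(["x", "y", "z"]):
    terms = [f"{Xi[i, j]:+.3f} {names[i]}" for i in range(len(names)) if Xi[i, j] != 0.0]
    print(f"d{var}/dt = " + " ".join(terms))
```

The threshold plays the role of the sparsity-promoting hyperparameter: larger values yield simpler models at the possible cost of accuracy, which is the complexity and accuracy trade-off the abstract describes. With noisy data or estimated derivatives, recovery would be less clean than in this idealized setting.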
