Sparsistent Model Discovery
This work addresses the challenge of automated model discovery in scientific fields where data are limited and noisy. It represents an incremental improvement, with specific gains in robustness to noise.
The paper tackles the discovery of partial differential equations (PDEs) from noisy spatio-temporal data. It shows that existing sparse regression methods can fail because the design matrices violate the irrepresentability condition of the Lasso. It then introduces a randomised adaptive Lasso which, once integrated into DeepMod, recovers nonlinear and chaotic PDEs at up to $\mathcal{O}(2)$ higher noise-to-sample ratios than state-of-the-art algorithms, using a single set of hyperparameters.
Discovering the partial differential equations underlying spatio-temporal datasets from very limited and highly noisy observations is of paramount interest in many scientific fields. However, it remains an open question when model discovery algorithms based on sparse regression can actually recover the underlying physical processes. In this work, we show that the design matrices used to infer the equations by sparse regression can violate the irrepresentability condition (IRC) of the Lasso, even when derived from analytical PDE solutions (i.e. without additional noise). Sparse regression techniques that can recover the true underlying model when the IRC is violated are therefore required, which motivates the introduction of the randomised adaptive Lasso. We show that, once the latter is integrated within the deep learning model discovery framework DeepMod, a wide variety of nonlinear and chaotic canonical PDEs can be recovered (1) at up to $\mathcal{O}(2)$ higher noise-to-sample ratios than state-of-the-art algorithms and (2) with a single set of hyperparameters, which paves the road towards truly automated model discovery.
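To make the irrepresentability condition concrete, the sketch below evaluates its standard textbook form, $\lVert X_{S^c}^\top X_S (X_S^\top X_S)^{-1} \operatorname{sign}(\beta_S) \rVert_\infty < 1$, for a given design matrix and a known support $S$. This is an illustrative check only, not code from the paper or from DeepMod; the function name `irrepresentability_margin` and the toy matrix are assumptions introduced here, and the exact variant of the condition used by the authors may differ.

```python
import numpy as np


def irrepresentability_margin(X, support, signs):
    """Return max |X_Sc^T X_S (X_S^T X_S)^{-1} sign(beta_S)|.

    The (strong) irrepresentability condition holds when the result is < 1.
    `support` lists the column indices of the true terms and `signs` gives the
    sign of each true coefficient, in the same order as `support`.
    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    support = np.asarray(support, dtype=int)
    rest = np.setdiff1d(np.arange(X.shape[1]), support)
    X_s, X_c = X[:, support], X[:, rest]
    # Solve (X_S^T X_S) w = sign(beta_S) instead of forming the inverse.
    w = np.linalg.solve(X_s.T @ X_s, np.asarray(signs, dtype=float))
    return float(np.max(np.abs(X_c.T @ X_s @ w)))


# Toy library: columns 0 and 1 are the true terms, column 2 is a spurious
# term strongly correlated with both of them.
X_toy = np.array([
    [1.0, 0.0, 0.8],
    [0.0, 1.0, 0.8],
    [0.0, 0.0, 0.0],
])
margin = irrepresentability_margin(X_toy, support=[0, 1], signs=[1.0, 1.0])
print(margin, "IRC holds" if margin < 1 else "IRC violated")  # margin is about 1.6
```

In this toy case the spurious column is too well explained by the true columns, so the margin exceeds 1 and the plain Lasso is not guaranteed to select the correct support, which is the failure mode the abstract describes for PDE libraries.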