LG | OC | ML | Feb 26, 2021

Sparsity in long-time control of neural ODEs

arXiv:2102.13566v3 | 11 citations
Originality: Incremental advance
AI Analysis

This paper addresses sparsity and stability in the optimal control of neural ODEs for machine learning. It offers theoretical insights, but the advance is incremental because it builds on existing frameworks.

The paper studies the control of neural ODEs with sparsity-inducing $\ell^1$ penalties. It proves that optimal controls vanish after a finite stopping time, which yields an ordered sparsity pattern in the associated residual networks, and it provides a polynomial stability estimate for the empirical risk with respect to the time horizon.

We consider the neural ODE and optimal control perspective of supervised learning, with $\ell^1$-control penalties, where rather than only minimizing a final cost (the \emph{empirical risk}) for the state, we integrate this cost over the entire time horizon. We prove that any optimal control (for this cost) vanishes beyond some positive stopping time. When seen in the discrete-time context, this result entails an \emph{ordered} sparsity pattern for the parameters of the associated residual neural network: ordered in the sense that these parameters are all $0$ beyond a certain layer. Furthermore, we provide a polynomial stability estimate for the empirical risk with respect to the time horizon. This can be seen as a \emph{turnpike property}, for nonsmooth dynamics and functionals with $\ell^1$-penalties, and without any smallness assumptions on the data, both of which are new in the literature.
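
The ordered sparsity pattern described in the abstract can be illustrated with a minimal sketch. It is not the paper's code. It assumes a forward Euler discretisation of the dynamics $\dot{x}(t) = W(t)\tanh(x(t)) + b(t)$, which is one common neural ODE form chosen here only for illustration, and the names `resnet_forward`, `last_active_layer`, `l1_penalty` and `k_star` are hypothetical. The parameters are random rather than optimal; the point is only that once all parameters are $0$ beyond layer `k_star`, every later residual layer acts as the identity, so the state stops moving.

```python
import numpy as np


def resnet_forward(x0, weights, biases, dt):
    """Forward Euler steps x_{k+1} = x_k + dt * (W_k tanh(x_k) + b_k).

    Assumed dynamics (illustrative only): x'(t) = W(t) tanh(x(t)) + b(t).
    Returns the list of states after each layer, starting with x0.
    """
    x = x0.copy()
    trajectory = [x.copy()]
    for W, b in zip(weights, biases):
        x = x + dt * (np.tanh(x) @ W.T + b)
        trajectory.append(x.copy())
    return trajectory


def last_active_layer(weights, biases):
    """1-based index of the last layer with a nonzero parameter (0 if none)."""
    active = [
        k
        for k, (W, b) in enumerate(zip(weights, biases), start=1)
        if np.any(W) or np.any(b)
    ]
    return active[-1] if active else 0


def l1_penalty(weights, biases, dt):
    """Discrete analogue of the time-integrated l^1 norm of the control."""
    return dt * sum(np.abs(W).sum() + np.abs(b).sum() for W, b in zip(weights, biases))


rng = np.random.default_rng(0)
n_layers, k_star, d, dt = 10, 4, 3, 0.1

# Hypothetical parameters: random up to layer k_star, exactly zero afterwards.
weights = [
    rng.normal(size=(d, d)) if k < k_star else np.zeros((d, d))
    for k in range(n_layers)
]
biases = [
    rng.normal(size=d) if k < k_star else np.zeros(d)
    for k in range(n_layers)
]

x0 = rng.normal(size=(5, d))  # five sample points in R^3
traj = resnet_forward(x0, weights, biases, dt)

print("last active layer:", last_active_layer(weights, biases))
print("state frozen after k_star:", np.array_equal(traj[k_star], traj[-1]))
print("l1 penalty:", l1_penalty(weights, biases, dt))
```

In this sketch the layers after `k_star` contribute nothing to the $\ell^1$ penalty and leave the state unchanged, so they could be removed without altering the network's output.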

Code Implementations: 1 repo
