17.1 LG Apr 16
Natural gradient descent with momentum
Anthony Nouy, Agustín Somacal
We consider the problem of approximating a function by an element of a nonlinear manifold that admits a differentiable parametrization, typical examples being neural networks with differentiable activation functions or tensor networks. Natural gradient descent (NGD) for the optimization of a loss function can be seen as a preconditioned gradient descent where updates in the parameter space are driven by a functional perspective. In a spirit similar to Newton's method, an NGD step uses, instead of the Hessian, the Gram matrix of the generating system of the tangent space to the approximation manifold at the current iterate, with respect to a suitable metric. This corresponds to a locally optimal update in function space, following a projected gradient onto the tangent space to the manifold. Still, both gradient and natural gradient descent methods can get stuck in local minima. Furthermore, when the model class is a nonlinear manifold or the loss function is not ideally conditioned (e.g., the KL-divergence for density estimation, or a norm of the residual of a partial differential equation in physics-informed learning), even the natural gradient might yield non-optimal directions at each step. This work introduces a natural version of classical inertial dynamic methods such as Heavy-Ball or Nesterov and shows how it can improve the learning process when working with nonlinear model classes.
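To make the preconditioning idea concrete, here is a minimal sketch of a natural gradient step combined with a heavy-ball velocity, written for a discretized least-squares loss. Everything specific in it is an assumption chosen for illustration and is not taken from the paper: the two-parameter model `a * tanh(b * x)`, the sample grid, the target, the step size `lr`, the momentum factor `beta`, and the small regularization `eps`. In particular, applying momentum to the natural gradient direction in parameter space is only one simple way to combine the two ideas and may differ from the authors' construction.

```python
import numpy as np

def model(theta, x):
    """Hypothetical two-parameter nonlinear model f_theta(x) = a * tanh(b * x)."""
    a, b = theta
    return a * np.tanh(b * x)

def jacobian(theta, x):
    """Columns are df/da and df/db at the sample points: they generate the tangent space."""
    a, b = theta
    t = np.tanh(b * x)
    return np.stack([t, a * x * (1.0 - t ** 2)], axis=1)

def loss(theta, x, y):
    """Discretized squared L2 loss."""
    return 0.5 * np.mean((model(theta, x) - y) ** 2)

def ngd_heavy_ball(theta0, x, y, lr=0.5, beta=0.3, steps=200, eps=1e-10):
    """Natural gradient direction plus a heavy-ball velocity (illustrative only)."""
    theta = np.array(theta0, dtype=float)
    velocity = np.zeros_like(theta)
    n = x.size
    for _ in range(steps):
        J = jacobian(theta, x)
        residual = model(theta, x) - y
        grad = J.T @ residual / n   # Euclidean gradient of the loss
        gram = J.T @ J / n          # Gram matrix of the tangent-space generators
        # Precondition with the (slightly regularized) Gram matrix instead of a Hessian.
        nat_grad = np.linalg.solve(gram + eps * np.eye(theta.size), grad)
        velocity = beta * velocity - lr * nat_grad
        theta = theta + velocity
    return theta

x = np.linspace(-2.0, 2.0, 41)
target = 1.5 * np.tanh(0.7 * x)     # target chosen inside the model class
theta_init = (1.0, 1.0)
theta_fit = ngd_heavy_ball(theta_init, x, target)
initial_loss = loss(theta_init, x, target)
final_loss = loss(theta_fit, x, target)
```

For a squared loss and the discrete L2 metric, the Gram-preconditioned direction coincides with a Gauss-Newton direction, which is why this toy problem converges quickly; the abstract's point is that harder losses and model classes are where inertia is expected to help.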
LG Feb 5, 2024
State estimation of urban air pollution with statistical, physical, and super-learning graph models
Matthieu Dolbeault, Olga Mula, Agustín Somacal
We consider the problem of real-time reconstruction of urban air pollution maps. The task is challenging due to the heterogeneous sources of available data, the scarcity of direct measurements, the presence of noise, and the large surfaces that need to be considered. In this work, we introduce different reconstruction methods based on posing the problem on city graphs. Our strategies can be classified as fully data-driven, physics-driven, or hybrid, and we combine them with super-learning models. The performance of the methods is tested in the case of the inner city of Paris, France.
ML Feb 6, 2020
Uncovering differential equations from data with hidden variables
Agustín Somacal, Yamila Barrera, Leonardo Boechi et al.
SINDy is a method for learning systems of differential equations from data by solving a sparse linear regression optimization problem [Brunton et al., 2016]. In this article, we propose an extension of the SINDy method that learns systems of differential equations in cases where some of the variables are not observed. Our extension is based on regressing a higher-order time derivative of a target variable onto a dictionary of functions that includes lower-order time derivatives of the target variable. We evaluate our method by measuring the prediction accuracy of the learned dynamical systems on synthetic data and on a real dataset of temperature time series provided by the Réseau de Transport d'Électricité (RTE). Our method provides high-quality short-term forecasts and is orders of magnitude faster than competing methods for learning differential equations with latent variables.
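The regression idea can be sketched on a toy problem. Below, a damped oscillator is written as a first-order system in position and velocity, but only the position `x` is treated as observed; the second derivative of `x` is then regressed onto a small dictionary built from `x` and its first derivative. The system, its coefficients, the dictionary, the finite-difference derivatives, and the threshold are all assumptions chosen for illustration, and the sparse solver is a plain sequentially thresholded least squares loop rather than the authors' implementation.

```python
import numpy as np

def stlsq(library, target, threshold=0.1, iterations=10):
    """Sequentially thresholded least squares: zero small coefficients, then refit the rest."""
    coef = np.linalg.lstsq(library, target, rcond=None)[0]
    for _ in range(iterations):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        active = ~small
        if not active.any():
            break
        coef[active] = np.linalg.lstsq(library[:, active], target, rcond=None)[0]
    return coef

# Assumed ground truth: x' = y, y' = -k*x - c*y, with the velocity y treated as hidden.
k, c = 4.0, 0.4
gamma = c / 2.0
omega_d = np.sqrt(k - gamma ** 2)
t = np.linspace(0.0, 10.0, 2001)
x_obs = np.exp(-gamma * t) * np.cos(omega_d * t)   # only the position is observed

# Estimate time derivatives of the observed variable by finite differences.
dt = t[1] - t[0]
dx = np.gradient(x_obs, dt)
ddx = np.gradient(dx, dt)

# Drop boundary samples, where one-sided differences are less accurate.
inner = slice(5, -5)
x, dx, ddx = x_obs[inner], dx[inner], ddx[inner]

# Dictionary containing the lower-order derivative x' alongside x.
names = ["1", "x", "x'", "x^2", "x x'", "x'^2"]
library = np.stack([np.ones_like(x), x, dx, x ** 2, x * dx, dx ** 2], axis=1)

# Regress the higher-order derivative x'' onto the dictionary.
coef = stlsq(library, ddx)
equation = " + ".join(f"({w:.3f}) {name}" for w, name in zip(coef, names) if w != 0.0)
```

In this noise-free setting the fit should keep only the `x` and `x'` terms, with coefficients close to `-k` and `-c`, which is the second-order equation obtained by eliminating the hidden velocity. Real measurements would need a more careful derivative estimate than repeated finite differences.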