LG · MS · OC · Mar 18, 2024

PETScML: Second-order solvers for training regression problems in Scientific Machine Learning

arXiv:2403.12188v1 · 6 citations · h-index: 23 · PASC
Originality: Incremental advance
AI Analysis

This work addresses training inefficiency in scientific machine learning by connecting deep-learning software to conventional optimization solvers, a domain-specific improvement aimed at researchers and engineers in computational science.

The authors introduce a software framework that lets neural-network training in scientific machine learning use conventional second-order solvers. Across the regression test cases they consider, these solvers compare favorably with adaptive first-order methods in either cost or accuracy.

In recent years, we have witnessed the emergence of scientific machine learning as a data-driven tool for the analysis, by means of deep-learning techniques, of data produced by computational science and engineering applications. At the core of these methods is the supervised training algorithm to learn the neural network realization, a highly non-convex optimization problem that is usually solved using stochastic gradient methods. However, distinct from deep-learning practice, scientific machine-learning training problems feature a much larger volume of smooth data and better characterizations of the empirical risk functions, which make them suited for conventional solvers for unconstrained optimization. We introduce a lightweight software framework built on top of the Portable, Extensible Toolkit for Scientific Computation (PETSc) to bridge the gap between deep-learning software and conventional solvers for unconstrained minimization. We empirically demonstrate the superior efficacy of a trust region method based on the Gauss-Newton approximation of the Hessian in improving the generalization errors arising from regression tasks when learning surrogate models for a wide range of scientific machine-learning techniques and test cases. All the conventional second-order solvers tested, including L-BFGS and inexact Newton with line-search, compare favorably, either in terms of cost or accuracy, with the adaptive first-order methods used to validate the surrogate models.
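To make the idea of a trust region method built on the Gauss-Newton Hessian approximation concrete, here is a minimal sketch in plain Python. It is not the PETScML implementation and does not call PETSc. The two-parameter model `a * exp(b * x)`, the dogleg subproblem solver, the radius-update constants, and every function name are assumptions chosen for illustration only. For a least-squares loss `0.5 * sum(r_i**2)` with residual Jacobian `J`, the gradient is `J^T r` and the Gauss-Newton approximation replaces the Hessian with `J^T J`.

```python
import math

def residuals_and_jacobian(params, xs, ys):
    """Residuals r_i = a*exp(b*x_i) - y_i and Jacobian rows (dr/da, dr/db)."""
    a, b = params
    r, J = [], []
    for x, y in zip(xs, ys):
        e = math.exp(b * x)
        r.append(a * e - y)
        J.append((e, a * x * e))
    return r, J

def loss(params, xs, ys):
    r, _ = residuals_and_jacobian(params, xs, ys)
    return 0.5 * sum(ri * ri for ri in r)

def quad(B, p):
    """Quadratic form p^T B p for a 2x2 matrix B."""
    return (p[0] * (B[0][0] * p[0] + B[0][1] * p[1])
            + p[1] * (B[1][0] * p[0] + B[1][1] * p[1]))

def dogleg_step(g, B, delta):
    """Approximately minimize g.p + 0.5 p^T B p subject to ||p|| <= delta."""
    (b11, b12), (b21, b22) = B
    gnorm = math.hypot(g[0], g[1])
    det = b11 * b22 - b12 * b21
    p_gn = None
    if det > 1e-12 * b11 * b22:
        # Full Gauss-Newton step: solve (J^T J) p = -J^T r.
        p_gn = (-(b22 * g[0] - b12 * g[1]) / det,
                (b21 * g[0] - b11 * g[1]) / det)
        if math.hypot(p_gn[0], p_gn[1]) <= delta:
            return p_gn
    # Cauchy point: minimizer of the model along steepest descent.
    gBg = quad(B, g)
    alpha = gnorm ** 2 / gBg if gBg > 0 else math.inf
    if alpha * gnorm >= delta:
        return (-delta * g[0] / gnorm, -delta * g[1] / gnorm)
    p_c = (-alpha * g[0], -alpha * g[1])
    if p_gn is None:
        return p_c
    # Walk from the Cauchy point toward the Gauss-Newton step until the
    # trust-region boundary: solve ||p_c + tau*d||^2 = delta^2 for tau.
    d = (p_gn[0] - p_c[0], p_gn[1] - p_c[1])
    qa = d[0] ** 2 + d[1] ** 2
    qb = 2.0 * (p_c[0] * d[0] + p_c[1] * d[1])
    qc = p_c[0] ** 2 + p_c[1] ** 2 - delta ** 2
    tau = (-qb + math.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)
    return (p_c[0] + tau * d[0], p_c[1] + tau * d[1])

def fit(xs, ys, params=(1.0, 0.0), delta=1.0, delta_max=10.0,
        max_iter=100, tol=1e-10):
    params = tuple(params)
    f = loss(params, xs, ys)
    for _ in range(max_iter):
        r, J = residuals_and_jacobian(params, xs, ys)
        g = (sum(ri * j[0] for ri, j in zip(r, J)),
             sum(ri * j[1] for ri, j in zip(r, J)))
        if math.hypot(g[0], g[1]) < tol:
            break
        # Gauss-Newton approximation of the Hessian: B = J^T J.
        b11 = sum(j[0] * j[0] for j in J)
        b12 = sum(j[0] * j[1] for j in J)
        b22 = sum(j[1] * j[1] for j in J)
        B = ((b11, b12), (b12, b22))
        p = dogleg_step(g, B, delta)
        predicted = -(g[0] * p[0] + g[1] * p[1] + 0.5 * quad(B, p))
        trial = (params[0] + p[0], params[1] + p[1])
        f_trial = loss(trial, xs, ys)
        rho = (f - f_trial) / predicted if predicted > 0 else 0.0
        # Shrink the radius on poor agreement, grow it on good boundary steps.
        if rho < 0.25:
            delta *= 0.25
        elif rho > 0.75 and math.hypot(p[0], p[1]) >= 0.99 * delta:
            delta = min(2.0 * delta, delta_max)
        if rho > 1e-4:  # accept only steps that actually reduce the loss
            params, f = trial, f_trial
    return params

# Noise-free synthetic data from a = 2.0, b = -1.5 (illustrative values).
xs = [0.25 * i for i in range(9)]
ys = [2.0 * math.exp(-1.5 * x) for x in xs]
a_fit, b_fit = fit(xs, ys)
print(a_fit, b_fit)
```

The sketch uses the full batch at every step, which matches the abstract's point that smooth, plentiful data and a well-characterized empirical risk suit deterministic solvers. For a real network the matrix `J^T J` would not be formed explicitly; a matrix-free Krylov solve of the subproblem is the usual substitute, and that is beyond this toy example.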
