LG NA

May 23, 2024

Automatic Differentiation is Essential in Training Neural Networks for Solving Differential Equations

Chuqi Chen, Yahong Yang, Yang Xiang, Wenrui Hao

arXiv:2405.14099v4

12.519 citations

h-index: 5

J Sci Comput

Originality: Incremental advance

AI Analysis

This work addresses the challenge of efficiently training neural networks for PDEs in science and engineering. It is an incremental advance: it builds on existing neural network approaches and focuses on how derivatives are computed.

The paper compares automatic differentiation (AD) and finite difference (FD) methods for training neural networks that solve partial differential equations (PDEs). It finds that AD outperforms FD in training speed and residual loss, and it uses truncated entropy as the quantitative metric to demonstrate this advantage.

Neural network-based approaches have recently shown significant promise in solving partial differential equations (PDEs) in science and engineering, especially in scenarios featuring complex domains or the incorporation of empirical data. One advantage of neural network methods for PDEs lies in their use of automatic differentiation (AD), which requires only the sample points themselves, unlike traditional finite difference (FD) approximations, which require nearby local points to compute derivatives. In this paper, we quantitatively demonstrate the advantage of AD in training neural networks. The concept of truncated entropy is introduced to characterize the training property. Specifically, through comprehensive experimental and theoretical analyses conducted on random feature models and two-layer neural networks, we discover that the defined truncated entropy serves as a reliable metric for quantifying the residual loss of random feature models and the training speed of neural networks for both AD and FD methods. Our experimental and theoretical analyses demonstrate that, from a training perspective, AD outperforms FD in solving PDEs.
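The distinction the abstract draws between AD and FD can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Jet` class (a small second-order forward-mode AD helper), the three fixed tanh features with weights `A`, `W`, `B`, and the step size `h`. It does not compute truncated entropy or reproduce any experiment; it only shows that the AD derivative is evaluated at the sample point alone, while the central difference must also evaluate the model at the neighbouring points x - h and x + h.

```python
import math


class Jet:
    """Value, first and second derivative of a scalar function at one point."""

    def __init__(self, val, d1=0.0, d2=0.0):
        self.val, self.d1, self.d2 = val, d1, d2

    def __add__(self, other):
        other = other if isinstance(other, Jet) else Jet(other)
        return Jet(self.val + other.val, self.d1 + other.d1, self.d2 + other.d2)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Jet) else Jet(other)
        # Product rule up to second order: (fg)'' = f''g + 2 f'g' + f g''.
        return Jet(
            self.val * other.val,
            self.d1 * other.val + self.val * other.d1,
            self.d2 * other.val + 2.0 * self.d1 * other.d1 + self.val * other.d2,
        )

    __rmul__ = __mul__


def tanh(z):
    """tanh that propagates derivatives when given a Jet."""
    if isinstance(z, Jet):
        t = math.tanh(z.val)
        s = 1.0 - t * t  # derivative of tanh at z.val
        return Jet(t, s * z.d1, s * z.d2 - 2.0 * t * s * z.d1 ** 2)
    return math.tanh(z)


# Hypothetical fixed features: u(x) = sum_k A[k] * tanh(W[k] * x + B[k]).
A = [0.7, -0.4, 1.2]
W = [1.0, -2.0, 0.5]
B = [0.0, 0.3, -1.0]


def u(x):
    total = 0.0
    for a, w, b in zip(A, W, B):
        total = total + a * tanh(w * x + b)
    return total


def second_derivative_ad(x):
    # AD: one evaluation at the sample point x itself.
    return u(Jet(x, 1.0, 0.0)).d2


def second_derivative_fd(x, h=1e-2):
    # FD: central difference, needs the neighbours x - h and x + h.
    return (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)


def second_derivative_exact(x):
    # Closed form for this particular model, used only as a reference.
    total = 0.0
    for a, w, b in zip(A, W, B):
        t = math.tanh(w * x + b)
        total += a * w * w * (-2.0 * t * (1.0 - t * t))
    return total


if __name__ == "__main__":
    x = 0.3
    print("AD   :", second_derivative_ad(x))
    print("FD   :", second_derivative_fd(x))
    print("exact:", second_derivative_exact(x))
```

In this toy setting the FD value carries a truncation error that depends on `h`, whereas the AD value matches the closed form up to floating-point rounding. The paper's argument concerns training behaviour rather than pointwise accuracy, so this sketch should be read only as an illustration of which points each method needs.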
