NA LG ST COMP-PH ML

May 15, 2022

Sobolev Acceleration and Statistical Optimality for Learning Elliptic Equations via Gradient Descent

Stanford

arXiv:2205.07331v39.211 citations

h-index: 70

Originality: Incremental advance

AI Analysis

This work studies how to learn elliptic equations efficiently and with statistical optimality, a problem relevant to researchers in computational mathematics and machine learning. It offers theoretical insight into training acceleration, but the advance is incremental because it builds on existing methods.

The paper tackles the statistical limits of gradient descent for solving inverse problems from noisy observations, specifically for learning elliptic PDEs with Sobolev training, Deep Ritz Methods, and Physics Informed Neural Networks. It proves that gradient descent on these objectives achieves statistical optimality, with the optimal number of epochs increasing with sample size and task hardness.

In this paper, we study the statistical limits, in terms of Sobolev norms, of gradient descent for solving inverse problems from randomly sampled noisy observations using a general class of objective functions. Our class of objective functions includes Sobolev training for kernel regression, Deep Ritz Methods (DRM), and Physics Informed Neural Networks (PINN) for solving elliptic partial differential equations (PDEs) as special cases. We consider a potentially infinite-dimensional parameterization of our model using a suitable Reproducing Kernel Hilbert Space and a continuous parameterization of problem hardness through the definition of kernel integral operators. We prove that gradient descent over this objective function can also achieve statistical optimality and that the optimal number of passes over the data increases with sample size. Based on our theory, we explain an implicit acceleration of using a Sobolev norm as the objective function for training, inferring that the optimal number of epochs of DRM becomes larger than that of PINN when both the data size and the hardness of tasks increase, although both DRM and PINN can achieve statistical optimality.
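The role of the number of passes over the data can be illustrated with a minimal sketch of gradient descent in a Reproducing Kernel Hilbert Space. This is not the paper's setup: it uses a plain squared loss rather than a Sobolev, DRM, or PINN objective, and the Gaussian kernel, bandwidth, noise level, target function, and step size are all illustrative assumptions. It only shows the general mechanism that the iteration count acts as a regularization parameter, so that some intermediate stopping time gives the best error against the noiseless target.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=0.2):
    """Gaussian (RBF) kernel matrix between 1-D point sets x and y.
    The kernel and bandwidth are illustrative choices, not from the paper."""
    diff = x[:, None] - y[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def kernel_gradient_descent(K, y, step, n_iters):
    """Gradient descent on the empirical squared loss over an RKHS.

    The iterate is f_t = sum_j alpha_j k(x_j, .). The functional gradient
    step reduces to alpha <- alpha - (step / n) * (K @ alpha - y).
    Returns the coefficient vectors for t = 0, ..., n_iters.
    """
    n = len(y)
    alpha = np.zeros(n)
    path = [alpha.copy()]
    for _ in range(n_iters):
        alpha = alpha - (step / n) * (K @ alpha - y)
        path.append(alpha.copy())
    return path

# Synthetic noisy observations of a smooth target (hypothetical example data).
rng = np.random.default_rng(0)
n = 50
x_train = np.sort(rng.uniform(0.0, 1.0, n))
truth = lambda x: np.sin(2 * np.pi * x)
y_train = truth(x_train) + 0.3 * rng.standard_normal(n)

K = gaussian_kernel(x_train, x_train)
# Step chosen so that (step / n) * lambda_max(K) = 1, which keeps the iteration stable.
step = n / np.linalg.eigvalsh(K).max()
path = kernel_gradient_descent(K, y_train, step, n_iters=200)

# Training error decreases with every pass; error against the noiseless
# target is tracked separately to locate the best stopping time.
x_test = np.linspace(0.0, 1.0, 200)
K_test = gaussian_kernel(x_test, x_train)
train_mse = [np.mean((K @ a - y_train) ** 2) for a in path]
test_mse = [np.mean((K_test @ a - truth(x_test)) ** 2) for a in path]
best_t = int(np.argmin(test_mse))
print("best stopping iteration:", best_t)
```

In this sketch `best_t` plays the role of the optimal number of epochs. Rerunning with a larger `n` or a rougher target is one way to explore how that stopping time moves, though the rates proved in the paper concern its own class of objectives, not this simplified loss.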
