LG | DS | OC | Dec 7, 2024

Convergence analysis of wide shallow neural operators within the framework of Neural Tangent Kernel

arXiv:2412.05545v3 | 1 citation | h-index: 2
Originality: Incremental advance
AI Analysis

This paper provides theoretical guarantees for training neural operators in scientific computing, addressing a gap in the understanding of their optimization behavior. The advance is incremental: it extends existing NTK analysis to this specific domain.

The paper addresses the lack of training error analysis for neural operators. It analyzes the convergence of gradient descent for wide shallow neural operators within the Neural Tangent Kernel (NTK) framework and shows that over-parameterization combined with random initialization yields linear convergence to a global minimum, in both continuous and discrete time.
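For orientation, linear convergence results of the NTK type for shallow networks usually take the shape sketched below. This is the generic form of such bounds, not the paper's exact theorem: the symbols $u(t)$ and $u(k)$ (predictions on the training data), $y$ (targets), $\lambda_0$ (smallest eigenvalue of the limiting NTK Gram matrix), and $\eta$ (step size) are assumed notation, and the paper's constants and norms may differ.

```latex
\|y - u(t)\|_2^2 \;\le\; e^{-\lambda_0 t}\,\|y - u(0)\|_2^2
\quad \text{(continuous time, gradient flow)},
\qquad
\|y - u(k)\|_2^2 \;\le\; \Bigl(1 - \tfrac{\eta \lambda_0}{2}\Bigr)^{k}\,\|y - u(0)\|_2^2
\quad \text{(discrete time, gradient descent)}.
```

Bounds of this shape typically hold with high probability over the random initialization once the width is large enough relative to the number of training samples and $1/\lambda_0$.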

Neural operators aim to approximate operators mapping between Banach spaces of functions and have achieved much success in scientific computing. In contrast to certain deep learning-based solvers, such as Physics-Informed Neural Networks (PINNs) and the Deep Ritz Method (DRM), neural operators can solve a whole class of Partial Differential Equations (PDEs). Although much work has analyzed the approximation and generalization errors of neural operators, their training error remains largely unanalyzed. In this work, we conduct a convergence analysis of gradient descent for wide shallow neural operators and physics-informed shallow neural operators within the framework of the Neural Tangent Kernel (NTK). The core idea is that over-parameterization and random initialization together ensure that each weight vector remains near its initialization throughout all iterations, which yields linear convergence of gradient descent. We demonstrate that, in the over-parameterized setting, gradient descent finds the global minimum in both continuous time and discrete time.
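The core idea in the abstract (wide width plus random initialization keeps every weight vector near its starting point while the loss shrinks) can be observed numerically. The sketch below is a simplified stand-in, not the paper's setting: it trains a plain two-layer ReLU network on scalar regression rather than a neural operator, and the function name `train_wide_shallow`, the data sizes, the step size, and the fixed random output signs are all assumptions chosen for illustration.

```python
import numpy as np

def train_wide_shallow(m, n=10, d=5, steps=1000, lr=0.2, seed=0):
    """Full-batch gradient descent on a width-m two-layer ReLU network.

    Illustrative stand-in only (an ordinary shallow network, not a neural
    operator). Returns the loss history and the largest distance any
    hidden weight vector moved from its initialization.
    """
    rng = np.random.default_rng(seed)
    # Data is drawn before the weights, so a fixed seed gives the same
    # dataset for every width m.
    X = rng.normal(size=(n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
    y = rng.normal(size=n)

    W = rng.normal(size=(m, d))            # hidden weights, w_r ~ N(0, I)
    a = rng.choice([-1.0, 1.0], size=m)    # output signs, kept fixed
    W0 = W.copy()

    losses = []
    for _ in range(steps):
        pre = X @ W.T                                  # (n, m) pre-activations
        u = (np.maximum(pre, 0.0) @ a) / np.sqrt(m)    # predictions, 1/sqrt(m) scaling
        r = u - y                                      # residual
        losses.append(0.5 * np.sum(r ** 2))
        # dL/dw_r = (1/sqrt(m)) * a_r * sum_i r_i * 1[w_r . x_i > 0] * x_i
        G = ((r[:, None] * (pre > 0)) * a[None, :]).T @ X / np.sqrt(m)
        W -= lr * G

    max_move = np.max(np.linalg.norm(W - W0, axis=1))
    return np.array(losses), max_move

if __name__ == "__main__":
    for width in (64, 1024, 16384):
        losses, move = train_wide_shallow(width)
        print(f"m={width:6d}  loss {losses[0]:.3e} -> {losses[-1]:.3e}  "
              f"max ||w_r - w_r(0)|| = {move:.3f}")
```

Running the script across widths shows the two effects together: the training loss falls by orders of magnitude, and the maximum weight displacement shrinks as the width grows, which is the behavior that lets the NTK stay nearly constant during training.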

