Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint
The paper addresses training inefficiencies in deep neural networks used for scientific applications such as solving partial differential equations, though the contribution appears incremental because it builds on existing methods.
The authors address the gap between the theoretically optimal approximation rates of deep neural networks and the accuracy achieved in practice by developing novel initializations and a hybrid least squares/gradient descent optimizer, both derived from an adaptive basis viewpoint. They report dramatic increases in accuracy and convergence rate on regression and physics-informed neural network benchmarks.
Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dramatic increases in accuracy and convergence rate for benchmarks characterizing scientific applications where DNNs are currently used, including regression problems and physics-informed neural networks for the solution of partial differential equations.
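To make the adaptive basis viewpoint concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single hidden layer with tanh activations, a 1D regression target, and plain gradient descent with a fixed step size. The names `basis`, `lstsq_coeffs`, and `hybrid_fit` are hypothetical. The hidden layer is treated as a set of basis functions, the output-layer coefficients are obtained by a linear least squares solve, and the hidden parameters are then updated by one gradient descent step with those coefficients held fixed.

```python
import numpy as np

def basis(x, w, b):
    # Hidden layer viewed as adaptive basis functions phi_j(x) = tanh(w_j * x + b_j).
    # Returns an (n_samples, width) matrix.
    return np.tanh(np.outer(x, w) + b)

def lstsq_coeffs(Phi, y):
    # Optimal output-layer coefficients for a fixed basis (linear least squares).
    c, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return c

def hybrid_fit(x, y, width=10, steps=200, lr=1e-2, seed=0):
    # Assumption: standard normal initialization, chosen only for illustration;
    # the initializations proposed in the paper are not reproduced here.
    rng = np.random.default_rng(seed)
    w = rng.normal(size=width)
    b = rng.normal(size=width)
    history = []
    for _ in range(steps):
        Phi = basis(x, w, b)
        c = lstsq_coeffs(Phi, y)          # least squares step on the coefficients
        r = Phi @ c - y                   # residual with optimal coefficients
        history.append(float(np.mean(r ** 2)))
        # Gradient of 0.5 * mean(r^2) with respect to w and b, with c held fixed.
        dPhi = (1.0 - Phi ** 2) * c       # d(output)/d(pre-activation), per basis function
        grad_w = (r[:, None] * dPhi * x[:, None]).mean(axis=0)
        grad_b = (r[:, None] * dPhi).mean(axis=0)
        w = w - lr * grad_w               # gradient descent step on the basis
        b = b - lr * grad_b
    c = lstsq_coeffs(basis(x, w, b), y)   # final coefficients for the final basis
    return w, b, c, history

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 100)
    y = np.sin(2.0 * np.pi * x)
    w, b, c, history = hybrid_fit(x, y)
    print("initial MSE:", history[0], "final MSE:", history[-1])
```

Because the coefficients are re-solved at every iteration, the recorded error at each step is the best achievable for the current basis, so the gradient steps only need to improve the basis itself.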