NA Mar 12
Convergence Analysis of Block Newton Methods for 1D Shallow Neural Network Approximation
Zhiqiang Cai, Anastassia Doktorova, Robert D. Falgout et al.
This paper analyzes the local convergence of the block Newton (BN) method introduced in [5, 6] for one-dimensional shallow neural network approximation of functions and diffusion-reaction problems. The BN method uses a 2×2 block nonlinear Gauss-Seidel, linear Gauss-Seidel, or Jacobi method for the outer iteration and the Newton method for the inner iteration. The two blocks correspond to the linear and the nonlinear parameters. Under some reasonable assumptions, we establish local convergence of the BN methods, as well as of the reduced BN (rBN) method, for one-dimensional diffusion-reaction problems and least-squares function approximation. Unlike common optimization methods, the rBN method can reduce the number of parameters during the optimization process when some neurons contribute little to the approximation or are already at nearly optimal locations.
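The outer/inner structure described in this abstract can be illustrated on a toy problem. The sketch below is not the BN method of [5, 6]: it assumes a hypothetical model `c * exp(b * x)` with one linear parameter `c` and one nonlinear parameter `b` in place of a shallow network, and a discrete least-squares loss. It only shows the pattern of a 2×2 block Gauss-Seidel sweep in which the linear block is solved exactly and the nonlinear block receives one Newton step.

```python
import math


def fit_block_gauss_seidel(xs, ys, b0, outer_iters=100):
    """Toy block Gauss-Seidel fit of y ~ c * exp(b * x) (illustrative only).

    Loss: 0.5 * sum((c * exp(b * x) - y) ** 2).
    The function name and the model are assumptions made for this sketch.
    """
    b = b0
    c = 0.0
    for _ in range(outer_iters):
        # Block 1 (linear parameter): exact least-squares solve for c, b frozen.
        phi = [math.exp(b * x) for x in xs]
        c = sum(y * p for y, p in zip(ys, phi)) / sum(p * p for p in phi)

        # Block 2 (nonlinear parameter): one Newton step on b, c frozen.
        grad = 0.0
        hess = 0.0
        for x, y, p in zip(xs, ys, phi):
            r = c * p - y          # residual at this sample
            d1 = c * x * p         # first derivative of the model in b
            d2 = c * x * x * p     # second derivative of the model in b
            grad += r * d1
            hess += d1 * d1 + r * d2
        b -= grad / hess
    return c, b


# Synthetic data from c = 2, b = -1, with a starting guess close to the solution.
xs = [i / 20 for i in range(21)]
ys = [2.0 * math.exp(-x) for x in xs]
c_fit, b_fit = fit_block_gauss_seidel(xs, ys, b0=-0.8)
```

The starting guess is deliberately close to the true value: as in the abstract, only local convergence is at issue, and this sketch includes no globalization such as a line search.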
NA Mar 25
Stable corrections for perturbed diagonally implicit Runge-Kutta methods
John Driscoll, Sigal Gottlieb, Zachary J. Grant et al.
The mixed accuracy framework for Runge-Kutta methods presented in Grant [JSC 2022], applied to diagonally implicit Runge-Kutta (DIRK) methods, can significantly speed up computation by replacing the implicit solver with less expensive, low-accuracy approaches such as lower-precision computation of the implicit solve, under-resolved iterative solvers, or simpler, less accurate models for the implicit stages. Understanding the effect of the perturbation errors introduced by the low-accuracy computations enables the design of stable and accurate mixed accuracy DIRK methods in which those errors are damped by multiplication by $\Delta t$ at multiple points in the simulation, giving a more accurate simulation than if low accuracy were used for the entire computation. To improve on this, explicit corrections were previously proposed and analyzed for accuracy, and their performance was tested in related work. Explicit corrections work well when the time step is sufficiently small but may introduce instabilities when the time step is larger. In this work, the stability of the mixed accuracy approach is carefully studied and used to design novel stabilized correction approaches.
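The trade-off described in this abstract can be seen in a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scheme is the implicit midpoint rule applied to the linear test equation y' = λy, the low-accuracy implicit solve is modeled as a relative perturbation `eps` of the exact stage value, and each explicit correction re-evaluates the stage equation once with the full-accuracy right-hand side.

```python
def midpoint_step(u, lam, dt, eps=0.0, corrections=0):
    """One implicit midpoint step for y' = lam * y (illustrative only).

    eps models a low-accuracy implicit solve as a relative stage error;
    corrections is the number of explicit full-accuracy correction passes.
    Both are hypothetical modeling choices for this sketch.
    """
    z = dt * lam
    stage = u / (1.0 - 0.5 * z)   # exact solution of stage = u + (z/2) * stage
    stage *= 1.0 + eps            # perturbation from the cheap implicit solve
    for _ in range(corrections):
        # Explicit correction: each pass multiplies the stage error by z / 2.
        stage = u + 0.5 * z * stage
    # Final update uses the full-accuracy right-hand side, so the remaining
    # stage error enters multiplied by z = dt * lam.
    return u + z * stage


def one_step_perturbation_error(lam, dt, eps, corrections):
    """Difference between the perturbed and unperturbed step from u = 1."""
    exact = midpoint_step(1.0, lam, dt)
    perturbed = midpoint_step(1.0, lam, dt, eps, corrections)
    return abs(perturbed - exact)


# Non-stiff case, dt * lam = -0.1: the correction shrinks the error.
small_plain = one_step_perturbation_error(-1.0, 0.1, 1e-3, corrections=0)
small_corr = one_step_perturbation_error(-1.0, 0.1, 1e-3, corrections=1)

# Stiff case, dt * lam = -100: the same correction amplifies the error.
large_plain = one_step_perturbation_error(-1000.0, 0.1, 1e-3, corrections=0)
large_corr = one_step_perturbation_error(-1000.0, 0.1, 1e-3, corrections=1)
```

In this model each correction scales the stage error by dt·λ/2, so it helps only when |dt·λ/2| < 1. This is one simple way to see why explicit corrections can become unstable at larger time steps.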