OC, LG | May 14

A Non-Monotone Preconditioned Trust-Region Method for Neural Network Training

Andrea Angino, Bindi Çapriqi, Shega Likaj, Ken Trotti, Rolf Krause

arXiv:2605.14860

AI Analysis

For practitioners training large-scale neural networks, this method makes parallel domain-decomposition training more efficient: fewer trust-region steps are rejected, so less CPU time is spent on discarded work.

The paper introduces a non-monotone variant of the Additively Preconditioned Trust-Region Strategy (APTS) for neural network training. Compared with APTS, it reduces CPU time by 30% and cuts rejected steps to one third while preserving accuracy.

Training deep neural networks at scale can benefit from domain decomposition, where the network is split into subdomains trained in parallel and coupled by a global trust-region mechanism. Building on the Additively Preconditioned Trust-Region Strategy (APTS), we propose a non-monotone variant with a nonlinear additive Schwarz preconditioner that combines parallel subdomain corrections with global coarse-space directions. A windowed acceptance criterion allows controlled objective increases, avoiding needless rejection of effective coarse steps. The resulting non-monotone APTS (NAPTS) preserves accuracy while reducing CPU time by 30% and cutting rejected steps to one third of those in APTS.
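
To make the idea of a windowed acceptance criterion concrete, here is a minimal Python sketch of a classical non-monotone trust-region test. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact rule, so the sketch assumes the common form in which a trial step is compared against the largest objective value among the last few accepted iterates. The class name `WindowedAcceptance` and the parameters `window` and `eta` are hypothetical.

```python
from collections import deque


class WindowedAcceptance:
    """Non-monotone step acceptance over a sliding window (illustrative).

    Assumption: the reference value is the maximum objective among the
    last `window` accepted iterates. The paper's exact rule may differ.
    """

    def __init__(self, window=5, eta=0.1):
        self.history = deque(maxlen=window)  # recent accepted objective values
        self.eta = eta                       # acceptance threshold on the ratio

    def accept(self, f_current, f_trial, predicted_reduction):
        # Seed the window with the starting objective on the first call.
        if not self.history:
            self.history.append(f_current)
        # A model that predicts no decrease cannot justify a step.
        if predicted_reduction <= 0:
            return False
        # Compare against the worst recent value, not the current one,
        # so a bounded increase relative to f_current can still pass.
        f_ref = max(self.history)
        rho = (f_ref - f_trial) / predicted_reduction
        accepted = rho >= self.eta
        if accepted:
            self.history.append(f_trial)
        return accepted
```

With `window=1` the reference value is always the current objective, so the test reduces to the usual monotone ratio test. With a larger window, a step that raises the objective from 0.8 to 0.9 can still be accepted if a value of 1.0 is inside the window, which is the kind of controlled increase the abstract describes.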

