New optimization algorithms for neural network training using operator splitting techniques
The work addresses optimization challenges in neural network training for machine learning practitioners, though it appears incremental because it builds on existing operator splitting methods.
The paper tackles neural network training optimization by developing new algorithms based on operator splitting techniques. Convergence toward local minima of the loss is studied empirically and validated through accuracy and loss results on the MNIST, Fashion-MNIST, and CIFAR-10 datasets.
In this paper we present a new class of optimization algorithms adapted for neural network training. These algorithms are based on a sequential operator splitting technique applied to associated dynamical systems. Furthermore, we investigate through numerical simulations the empirical rate of convergence of these iterative schemes toward a local minimum of the loss function, for suitable choices of the underlying hyperparameters. We validate the convergence of these optimizers using accuracy and loss results on the MNIST, Fashion-MNIST, and CIFAR-10 classification datasets.
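To make the idea of sequential operator splitting concrete, the following is a minimal illustrative sketch, not the schemes proposed in the paper. The abstract does not specify the dynamical system or the splitting, so we assume a damped heavy-ball system x' = v, v' = -gamma * v - grad f(x), and split its right-hand side into three parts (friction, gradient kick, drift), each solved exactly in turn over a step h (a Lie-Trotter splitting). The function names, the toy quadratic loss, and the values of `h` and `gamma` are all hypothetical choices made for illustration.

```python
import math


def grad_quadratic(x):
    # Gradient of the toy loss f(x) = 0.5 * (x0^2 + 10 * x1^2),
    # whose unique minimum is at the origin (assumed test problem).
    return [x[0], 10.0 * x[1]]


def lie_trotter_step(x, v, grad, h, gamma):
    # Sub-flow 1 (friction): v' = -gamma * v, solved exactly over time h.
    decay = math.exp(-gamma * h)
    v = [decay * vi for vi in v]
    # Sub-flow 2 (gradient kick): v' = -grad f(x) with x frozen,
    # solved exactly over time h.
    g = grad(x)
    v = [vi - h * gi for vi, gi in zip(v, g)]
    # Sub-flow 3 (drift): x' = v with v frozen, solved exactly over time h.
    x = [xi + h * vi for xi, vi in zip(x, v)]
    return x, v


def minimize(grad, x0, h=0.1, gamma=1.0, steps=500):
    # Iterate the composed sub-flows, starting from zero velocity.
    x, v = list(x0), [0.0] * len(x0)
    for _ in range(steps):
        x, v = lie_trotter_step(x, v, grad, h, gamma)
    return x


if __name__ == "__main__":
    x_final = minimize(grad_quadratic, [1.0, -2.0])
    print("final iterate:", x_final)
```

In this sketch the step size `h` and the damping `gamma` play the role of hyperparameters: on the toy quadratic the iterates spiral into the minimizer, and changing either value changes the empirical convergence rate. A different ordering of the sub-flows, or a different underlying dynamical system, would give a different optimizer.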