Mauricio Tano

8.5 | LG | Jun 18, 2020

Accelerating Training in Artificial Neural Networks with Dynamic Mode Decomposition

Mauricio E. Tano, Gavin D. Portwood, Jean C. Ragusa

Training deep neural networks (DNNs) frequently involves optimizing millions or even billions of parameters. Even with modern computing architectures, the computational expense of DNN training can inhibit, for instance, network architecture design optimization, hyper-parameter studies, and integration into scientific research cycles. The key factor limiting performance is that the update rule requires both a feed-forward evaluation and a backpropagation pass for each weight during optimization. In this work, we propose a method to decouple the evaluation of the update rule at each weight. First, Proper Orthogonal Decomposition (POD) is used to identify a current estimate of the principal directions of evolution of the weights in each layer, based on the evolution observed over a few backpropagation steps. Then, Dynamic Mode Decomposition (DMD) is used to learn the dynamics of the evolution of the weights in each layer along these principal directions. The DMD model is used to evaluate an approximate converged state of the weights. Afterward, a number of backpropagation steps are performed, starting from the DMD estimates, leading to an update of the principal directions and of the DMD model. This iterative process is repeated until convergence. By tuning the number of backpropagation steps used for each DMD model estimation, a significant reduction in the number of operations required to train the neural network can be achieved. In this paper, the DMD acceleration method is explained in detail, along with the theoretical justification for the acceleration it provides. The method is illustrated using a regression problem of key interest to the scientific machine learning community: the prediction of a pollutant concentration field in a diffusion-advection-reaction problem.
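The POD and DMD steps described in this abstract can be illustrated with a minimal NumPy sketch. It is a generic rank-truncated DMD extrapolation, not the authors' implementation: the function name `dmd_extrapolate`, its parameters, and the synthetic snapshot data (a fixed point plus two decaying modes standing in for recorded layer weights) are all assumptions made for illustration.

```python
import numpy as np

def dmd_extrapolate(snapshots, rank, n_steps):
    """Extrapolate weight snapshots n_steps ahead with a truncated DMD model.

    snapshots: array of shape (n_weights, m); column k holds the flattened
    weights of one layer after the k-th backpropagation step (hypothetical
    layout, chosen for this sketch).
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    # POD step: principal directions of the weight evolution via a thin SVD.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    r = min(rank, int(np.sum(s > 1e-10 * s[0])))
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
    # DMD step: reduced linear operator A_tilde = U^T Y V S^{-1}.
    A_tilde = (U.T @ Y @ Vh.T) / s
    eigvals, W = np.linalg.eig(A_tilde)
    # Advance the reduced coordinates of the last snapshot by n_steps.
    b = np.linalg.solve(W, U.T @ snapshots[:, -1])
    z = W @ (eigvals ** n_steps * b)
    # Lift back to the full weight space.
    return (U @ z).real

# Synthetic stand-in for recorded weights: a fixed point w_star plus two
# geometrically decaying modes (assumed data, not from the paper).
rng = np.random.default_rng(0)
w_star, v1, v2 = rng.normal(size=(3, 20))
steps = np.arange(10)
snapshots = (w_star[:, None]
             + np.outer(v1, 0.8 ** steps)
             + np.outer(v2, 0.5 ** steps))

w_estimate = dmd_extrapolate(snapshots, rank=3, n_steps=200)
```

In the iterative scheme the abstract describes, an estimate such as `w_estimate` would seed the next batch of backpropagation steps, after which the snapshots and the DMD model would be rebuilt.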

2.3 | COMP-PH | Jun 6, 2019

Acceleration of Radiation Transport Solves Using Artificial Neural Networks

Mauricio Tano, Jean Ragusa

Discontinuous Finite Element Methods (DFEM) have been widely used for solving $S_n$ radiation transport problems in participating and non-participating media. In the DFEM $S_n$ methodology, the transport equation is discretized into a set of algebraic equations that have to be solved for each spatial cell and angular direction, strictly preserving the flow of radiation in the system. At the core of a DFEM solver, a small matrix-vector system (of 8 independent equations for tri-linear DFEM in 3D hexahedral cells) has to be assembled and solved for each cell, angle, energy group, and time step. These systems are generally solved by direct Gaussian Elimination. The computational cost of the Gaussian Elimination, repeated for each phase-space cell, accounts for a large fraction of the total compute time. Here, we have designed a Machine Learning algorithm based on shallow Artificial Neural Networks (ANNs) to replace that Gaussian Elimination step, enabling a sizeable speed-up in the solution process. The key idea is to train an ANN with a large set of solutions of random one-cell transport problems and then to use the trained ANN to replace Gaussian Elimination in large-scale transport solvers. We observe that the ANNs reduce solution times by at least a factor of 4, while introducing mean absolute errors of 1-3 \% in large-scale transport solutions.
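The training-set idea in this abstract can be sketched as follows. The sketch covers only data generation, and it rests on loud assumptions: random diagonally dominant 8x8 matrices stand in for one-cell transport systems (a real DFEM $S_n$ local matrix has a specific structure that is not reproduced here), and the function name `make_one_cell_dataset` and its feature layout are hypothetical. The ANN itself is omitted; any regressor mapping `features` to `targets` would play its role.

```python
import numpy as np

def make_one_cell_dataset(n_samples, n_unknowns=8, seed=0):
    """Build (features, targets) pairs from random small linear systems.

    Each sample is a system A x = b with n_unknowns equations, solved by a
    direct method to produce the reference solution x.
    """
    rng = np.random.default_rng(seed)
    A = rng.uniform(-1.0, 1.0, size=(n_samples, n_unknowns, n_unknowns))
    # Assumption: shift the diagonal so every system is diagonally dominant
    # and therefore safely invertible.
    A += n_unknowns * np.eye(n_unknowns)
    b = rng.uniform(0.0, 1.0, size=(n_samples, n_unknowns))
    # Batched direct solve; the trailing axis makes b a stack of columns.
    x = np.linalg.solve(A, b[..., None])[..., 0]
    # Network input: flattened matrix entries followed by the right-hand side.
    features = np.concatenate([A.reshape(n_samples, -1), b], axis=1)
    return features, x

features, targets = make_one_cell_dataset(1000)
```

Flattening the matrix and right-hand side into a single input vector is one simple choice; a network trained on physical cell parameters instead of raw matrix entries would need a different feature layout.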
