LG · Jan 30, 2024

Speeding up and reducing memory usage for scientific machine learning via mixed precision

Joel Hayford, Jacob Goldman-Wetzler, Eric Wang, Lu Lu

arXiv:2401.16645v1 · 13.419 citations · h-index: 1 · Has Code · Comput Method Appl Mech Eng

Originality Synthesis-oriented

AI Analysis

This work targets computational efficiency for researchers and practitioners in scientific machine learning. The contribution is incremental: it applies an existing mixed precision technique to a specific domain.

The authors address the high computational and memory costs of training physics-informed neural networks (PINNs) and deep operator networks (DeepONets) by exploring mixed precision training, which they report reduces training time and memory usage while maintaining model accuracy.

Scientific machine learning (SciML) has emerged as a versatile approach to address complex computational science and engineering problems. Within this field, physics-informed neural networks (PINNs) and deep operator networks (DeepONets) stand out as the leading techniques for solving partial differential equations by incorporating both physical equations and experimental data. However, training PINNs and DeepONets requires significant computational resources, including long computational times and large amounts of memory. In search of computational efficiency, training neural networks using half precision (float16) rather than the conventional single (float32) or double (float64) precision has gained substantial interest, given the inherent benefits of reduced computational time and memory consumed. However, we find that float16 cannot be applied to SciML methods, because of gradient divergence at the start of training, weight updates going to zero, and the inability to converge to a local minimum. To overcome these limitations, we explore mixed precision, which is an approach that combines the float16 and float32 numerical formats to reduce memory usage and increase computational speed. Our experiments showcase that mixed precision training not only substantially decreases training times and memory demands but also maintains model accuracy. We also reinforce our empirical observations with a theoretical analysis. The research has broad implications for SciML in various computational applications.
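The "weight updates going to zero" failure mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the authors' code: the learning rate, gradient value, step count, and variable names are all hypothetical, and the gradient is held constant purely for illustration. It compares a weight kept in float16 with a float32 "master" weight that has a float16 working copy, which is one common way of combining the two formats.

```python
import numpy as np

# Hypothetical values chosen for illustration only (not from the paper).
lr = np.float32(1e-2)    # learning rate
grad = np.float32(1e-2)  # constant gradient, so each update is about 1e-4
steps = 100

# Pure float16: the weight and the update both live in half precision.
# The float16 spacing just below 1.0 is about 4.9e-4, so subtracting 1e-4
# rounds back to 1.0 and the weight never moves.
w_half = np.float16(1.0)
for _ in range(steps):
    update = np.float16(lr) * np.float16(grad)
    w_half = np.float16(w_half - update)

# Mixed precision (sketch): keep a float32 master weight, and cast it to
# float16 only for the compute-heavy forward/backward pass.
w_master = np.float32(1.0)
for _ in range(steps):
    w_compute = np.float16(w_master)  # half-precision working copy
    g = np.float32(grad)              # in real training, derived from w_compute
    w_master = np.float32(w_master - lr * g)

print("float16 weight after training:", w_half)
print("float32 master weight after training:", w_master)
print("float16 working copy of master:", np.float16(w_master))
```

In this toy setting the float16 weight stays at its initial value, while the float32 master weight accumulates the small updates. A real training loop would compute the gradient from the half-precision forward and backward pass instead of using a constant.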
