LG · AI · Aug 18, 2023

Tensor-Compressed Back-Propagation-Free Training for (Physics-Informed) Neural Networks

arXiv:2308.09858v2 · 16 citations · h-index: 46
Originality: Incremental advance
AI Analysis

This enables on-device training for resource-constrained platforms like FPGAs and micro-controllers, though it is incremental as it builds on zeroth-order optimization methods.

The paper tackles the challenge of implementing back-propagation on edge devices by proposing a completely BP-free training framework using only forward propagation, achieving minimal accuracy loss on MNIST and successfully training a PINN for a 20-dimensional PDE.

Back-propagation (BP) is widely used to compute the gradients in neural network training. However, it is hard to implement BP on edge devices because they lack the hardware and software resources to support automatic differentiation. This has greatly increased the design complexity and time-to-market of on-device training accelerators. This paper presents a completely BP-free framework that requires only forward propagation to train realistic neural networks. Our technical contributions are three-fold. First, we present a tensor-compressed variance reduction approach that greatly improves the scalability of zeroth-order (ZO) optimization, making it feasible to handle network sizes beyond the capability of previous ZO approaches. Second, we present a hybrid gradient evaluation approach to improve the efficiency of ZO training. Third, we extend our BP-free training framework to physics-informed neural networks (PINNs) by proposing a sparse-grid approach to estimate the derivatives in the loss function without using BP. Our BP-free training loses only a little accuracy on the MNIST dataset compared with standard first-order training. We also demonstrate successful results in training a PINN to solve a 20-dimensional Hamilton-Jacobi-Bellman PDE. This memory-efficient and BP-free approach may serve as a foundation for near-future on-device training on many resource-constrained platforms (e.g., FPGAs, ASICs, micro-controllers, and photonic chips).
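To make the zeroth-order idea in the abstract concrete, the following is a minimal sketch of a randomized gradient estimator that uses forward evaluations of the loss only. It is not the paper's implementation: it omits the tensor compression, the hybrid gradient evaluation, and the sparse-grid derivative estimator. The names `zo_gradient`, `toy_loss`, and `train`, the quadratic loss, and all hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np


def zo_gradient(loss_fn, theta, num_samples=20, mu=1e-3, rng=None):
    """Estimate the gradient of loss_fn at theta from forward evaluations only.

    Each sample perturbs theta along a random Gaussian direction xi and uses
    the forward difference (loss(theta + mu * xi) - loss(theta)) / mu as a
    directional-derivative estimate, which is then projected back onto xi.
    """
    rng = np.random.default_rng() if rng is None else rng
    base = loss_fn(theta)  # one unperturbed forward pass, reused by all samples
    grad = np.zeros_like(theta)
    for _ in range(num_samples):
        xi = rng.standard_normal(theta.shape)
        grad += (loss_fn(theta + mu * xi) - base) / mu * xi
    return grad / num_samples


def toy_loss(theta):
    # Hypothetical stand-in for a network loss: quadratic with minimum at 1.
    return 0.5 * np.sum((theta - 1.0) ** 2)


def train(dim=10, steps=300, lr=0.05, seed=0):
    """Plain gradient descent driven by the zeroth-order estimate (no BP)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    for _ in range(steps):
        theta -= lr * zo_gradient(toy_loss, theta, rng=rng)
    return theta


if __name__ == "__main__":
    theta = train()
    print("final loss:", toy_loss(theta))
```

The variance of this estimator grows with the number of trainable parameters, which is why reducing the parameter count (in the paper, through tensor compression) matters for scaling ZO training to realistic networks.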

