Distributed Multigrid Neural Solvers on Megavoxel Domains
This work addresses efficient neural PDE solving on high-resolution 3D domains and offers a scalable approach for computational science applications. The contribution is incremental, since it builds on existing multigrid and distributed training techniques.
The paper tackles the training of large-scale neural networks as PDE solvers for the generalized 3D Poisson equation on megavoxel domains. Using a distributed multigrid framework, the trained solver predicts full-field solutions at resolutions up to 512x512x512.
We consider the distributed training of large-scale neural networks that serve as PDE solvers producing full-field outputs. We specifically consider neural solvers for the generalized 3D Poisson equation over megavoxel domains. A scalable framework is presented that integrates two distinct advances. First, we accelerate the training of a large model via a method analogous to the multigrid technique used in numerical linear algebra. Here, the network is trained on inputs drawn from a hierarchy of resolutions, visited in sequence in a manner analogous to the 'V', 'W', 'F', and 'Half-V' cycles used in multigrid approaches. Second, in conjunction with the multigrid approach, we implement a distributed deep learning framework that significantly reduces the time to solve. We show the scalability of this approach on both GPU clusters (Azure VMs in the cloud) and CPU clusters (PSC Bridges2). This approach is deployed to train a generalized 3D Poisson solver that scales well, predicting full-field solutions at resolutions up to 512x512x512 for a high-dimensional family of inputs.
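The multigrid-style training schedule can be illustrated with a short sketch that only generates the order in which resolution levels are visited; it does not train a network. The function names (`level_schedule`, `to_resolutions`), the choice of three levels, and the factor-of-two spacing between levels are assumptions made for illustration and are not taken from the paper. The 'V', 'W', and 'F' orders follow the classical multigrid recursion, and 'Half-V' is assumed here to mean the coarse-to-fine half of a V cycle.

```python
from typing import List


def _mu_cycle(level: int, gamma: int) -> List[int]:
    """Visit order of a classical multigrid cycle; level 0 is the coarsest.

    gamma=1 yields a V cycle and gamma=2 yields a W cycle.
    """
    if level == 0:
        return [0]
    visits = [level]
    for _ in range(gamma):
        visits += _mu_cycle(level - 1, gamma)
    visits.append(level)
    return visits


def _f_cycle(level: int) -> List[int]:
    """Classical F cycle: an F cycle then a V cycle on the next coarser level."""
    if level == 0:
        return [0]
    return [level] + _f_cycle(level - 1) + _mu_cycle(level - 1, 1) + [level]


def _collapse(visits: List[int]) -> List[int]:
    """Merge consecutive visits to the same level into a single stage."""
    stages: List[int] = []
    for v in visits:
        if not stages or stages[-1] != v:
            stages.append(v)
    return stages


def level_schedule(cycle: str, n_levels: int) -> List[int]:
    """Return the sequence of levels (0 = coarsest) for one training cycle."""
    if n_levels < 1:
        raise ValueError("n_levels must be at least 1")
    finest = n_levels - 1
    if cycle == "V":
        raw = _mu_cycle(finest, 1)
    elif cycle == "W":
        raw = _mu_cycle(finest, 2)
    elif cycle == "F":
        raw = _f_cycle(finest)
    elif cycle == "Half-V":
        # Assumption: only the ascending (coarse-to-fine) half of a V cycle.
        raw = list(range(n_levels))
    else:
        raise ValueError(f"unknown cycle: {cycle!r}")
    return _collapse(raw)


def to_resolutions(levels: List[int], coarsest: int) -> List[int]:
    """Map level indices to grid sizes, assuming each level doubles the size."""
    return [coarsest * 2 ** level for level in levels]


if __name__ == "__main__":
    for name in ("V", "W", "F", "Half-V"):
        # Hypothetical hierarchy 128 -> 256 -> 512 voxels per side.
        print(name, to_resolutions(level_schedule(name, 3), coarsest=128))
```

Under these assumptions, each entry in a schedule corresponds to one training stage at the given per-side resolution, with the same network weights carried from one stage to the next. With only three levels the 'W' and 'F' orders coincide; they differ once a fourth level is added.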