Juntao Huang

NA
h-index2
8papers
103citations
Novelty49%
AI Score44

8 Papers

CHEM-PHJan 17, 2021Code
Data-driven discovery of multiscale chemical reactions governed by the law of mass action

Juntao Huang, Yizhou Zhou, Wen-An Yong

In this paper, we propose a data-driven method to discover multiscale chemical reactions governed by the law of mass action. First, we use a single matrix to represent the stoichiometric coefficients for both the reactants and products in a system without catalysis reactions. The negative entries in the matrix denote the stoichiometric coefficients for the reactants and the positive ones for the products. Second, we find that the conventional optimization methods usually get stuck in the local minima and could not find the true solution in learning the multiscale chemical reactions. To overcome this difficulty, we propose a partial-parameters-freezing (PPF) technique to progressively determine the network parameters by using the fact that the stoichiometric coefficients are integers. With such a technique, the dimension of the searching space is gradually reduced in the training process and the global mimina can be eventually obtained. Several numerical experiments including the classical Michaelis-Menten kinetics, the hydrogen oxidation reactions, and the simplified GRI-3.0 mechanism verify the good performance of our algorithm in learning the multiscale chemical reactions. The code is available at \url{https://github.com/JuntaoHuang/multiscale-chemical-reaction}.

68.1CLMay 1
AGoQ: Activation and Gradient Quantization for Memory-Efficient Distributed Training of LLMs

Wenxiang Lin, Juntao Huang, Luhan Zhang et al.

Quantization is a key method for reducing the GPU memory requirement of training large language models (LLMs). Yet, current approaches are ineffective for 4-bit activations and 8-bit gradients, which would easily cause slow convergence or accuracy loss. To address this, we introduce AGoQ, incorporating two new techniques: 1) a layer-aware activation quantization algorithm that allocates appropriate bit-widths for activations of various layers based on their types and pipeline stages to achieve near 4-bit activation storage, and 2) a gradient quantization algorithm that reduces memory usage and shortens communication time by employing 8-bit gradient storage and precision-preserving 8-bit All-Reduce communication. We conduct extensive experiments using different sizes of LLMs on two GPU clusters (up to 64 GPUs), and the experimental results show that our AGoQ reduces the memory by up to 52\% and achieves up to 1.34$\times$ improvement of training speed compared to state-of-the-art training systems Megatron-LM (w/ or w/o ZeRO), COAT and DeepSpeed with 8B to 32B LLaMA models, while achieving convergence loss on pretraining and comparable accuracy on downstream tasks with LLaMA architectures.

19.6NAApr 22
Machine learning moment closure models for the radiative transfer equation IV: enforcing symmetrizable hyperbolicity in two dimensions

Juntao Huang

This is our fourth work in the series on machine learning (ML) moment closure models for the radiative transfer equation (RTE). In the first three papers of this series, we considered the RTE in slab geometry in 1D1V (i.e. one dimension in physical space and one dimension in angular space), and introduced a gradient-based ML moment closure [1], then enforced the hyperbolicity through a symmetrizer [2], or together with physical characteristic speeds by learning the eigenvalues of the Jacobian matrix [3]. Here, we extend our framework to the RTE in 2D2V (i.e. two dimensions in physical space and two dimensions in angular space). The main idea is to preserve the leading part of the classical $P_N$ model and modify only the highest-order block row. By analyzing the structural properties of the $P_N$ model, we show that its coefficient matrices are symmetric and admit a block-tridiagonal structure. Then we use this property to introduce a block-diagonal symmetrizer for the ML moment model and derive explicit algebraic conditions on the closure blocks which guarantee the symmetrizable hyperbolicity of the resulting ML system. These conditions lead to a natural parametrization of the closure in terms of a symmetric positive definite matrix together with symmetric closure blocks, which can be learned from data while automatically enforcing symmetrizable hyperbolicity by construction. The numerical results show that the proposed framework improves upon the classical $P_N$ model while maintaining hyperbolicity.

NAJan 9, 2024
Hyperbolic Machine Learning Moment Closures for the BGK Equations

Andrew J. Christlieb, Mingchang Ding, Juntao Huang et al.

We introduce a hyperbolic closure for the Grad moment expansion of the Bhatnagar-Gross-Krook's (BGK) kinetic model using a neural network (NN) trained on BGK's moment data. This closure is motivated by the exact closure for the free streaming limit that we derived in our paper on closures in transport \cite{Huang2022-RTE1}. The exact closure relates the gradient of the highest moment to the gradient of four lower moments. As with our past work, the model presented here learns the gradient of the highest moment in terms of the coefficients of gradients for all lower ones. By necessity, this means that the resulting hyperbolic system is not conservative in the highest moment. For stability, the output layers of the NN are designed to enforce hyperbolicity and Galilean invariance. This ensures the model can be run outside of the training window of the NN. Unlike our previous work on radiation transport that dealt with linear models, the BGK model's nonlinearity demanded advanced training tools. These comprised an optimal learning rate discovery, one cycle training, batch normalization in each neural layer, and the use of the \texttt{AdamW} optimizer. To address the non-conservative structure of the hyperbolic model, we adopt the FORCE numerical method to achieve robust solutions. This results in a comprehensive computing model combining learned closures with methods for solving hyperbolic models. The proposed model can capture accurate moment solutions across a broad spectrum of Knudsen numbers. Our paper details the multi-scale model construction and is run on a range of test problems.

NASep 2, 2021
Machine learning moment closure models for the radiative transfer equation III: enforcing hyperbolicity and physical characteristic speeds

Juntao Huang, Yingda Cheng, Andrew J. Christlieb et al.

This is the third paper in a series in which we develop machine learning (ML) moment closure models for the radiative transfer equation (RTE). In our previous work \cite{huang2021gradient}, we proposed an approach to learn the gradient of the unclosed high order moment, which performs much better than learning the moment itself and the conventional $P_N$ closure. However, while the ML moment closure has better accuracy, it is not able to guarantee hyperbolicity and has issues with long time stability. In our second paper \cite{huang2021hyperbolic}, we identified a symmetrizer which leads to conditions that enforce that the gradient based ML closure is symmetrizable hyperbolic and stable over long time. The limitation of this approach is that in practice the highest moment can only be related to four, or fewer, lower moments. In this paper, we propose a new method to enforce the hyperbolicity of the ML closure model. Motivated by the observation that the coefficient matrix of the closure system is a lower Hessenberg matrix, we relate its eigenvalues to the roots of an associated polynomial. We design two new neural network architectures based on this relation. The ML closure model resulting from the first neural network is weakly hyperbolic and guarantees the physical characteristic speeds, i.e., the eigenvalues are bounded by the speed of light. The second model is strictly hyperbolic and does not guarantee the boundedness of the eigenvalues. Several benchmark tests including the Gaussian source problem and the two-material problem show the good accuracy, stability and generalizability of our hyperbolic ML closure model.

NAMay 30, 2021
Machine learning moment closure models for the radiative transfer equation II: enforcing global hyperbolicity in gradient based closures

Juntao Huang, Yingda Cheng, Andrew J. Christlieb et al.

This is the second paper in a series in which we develop machine learning (ML) moment closure models for the radiative transfer equation (RTE). In our previous work \cite{huang2021gradient}, we proposed an approach to directly learn the gradient of the unclosed high order moment, which performs much better than learning the moment itself and the conventional $P_N$ closure. However, the ML moment closure model in \cite{huang2021gradient} is not able to guarantee hyperbolicity and long time stability. We propose in this paper a method to enforce the global hyperbolicity of the ML closure model. The main idea is to seek a symmetrizer (a symmetric positive definite matrix) for the closure system, and derive constraints such that the system is globally symmetrizable hyperbolic. It is shown that the new ML closure system inherits the dissipativeness of the RTE and preserves the correct diffusion limit as the Knunsden number goes to zero. Several benchmark tests including the Gaussian source problem and the two-material problem show the good accuracy, long time stability and generalizability of our globally hyperbolic ML closure model.

NAMay 12, 2021
Machine learning moment closure models for the radiative transfer equation I: directly learning a gradient based closure

Juntao Huang, Yingda Cheng, Andrew J. Christlieb et al.

In this paper, we take a data-driven approach and apply machine learning to the moment closure problem for radiative transfer equation in slab geometry. Instead of learning the unclosed high order moment, we propose to directly learn the gradient of the high order moment using neural networks. This new approach is consistent with the exact closure we derive for the free streaming limit and also provides a natural output normalization. A variety of benchmark tests, including the variable scattering problem, the Gaussian source problem with both periodic and reflecting boundaries, and the two-material problem, show both good accuracy and generalizability of our machine learning closure model.

COMP-PHSep 28, 2020
Learning Thermodynamically Stable and Galilean Invariant Partial Differential Equations for Non-equilibrium Flows

Juntao Huang, Zhiting Ma, Yizhou Zhou et al.

In this work, we develop a method for learning interpretable, thermodynamically stable and Galilean invariant partial differential equations (PDEs) based on the Conservation-dissipation Formalism of irreversible thermodynamics. As governing equations for non-equilibrium flows in one dimension, the learned PDEs are parameterized by fully-connected neural networks and satisfy the conservation-dissipation principle automatically. In particular, they are hyperbolic balance laws and Galilean invariant. The training data are generated from a kinetic model with smooth initial data. Numerical results indicate that the learned PDEs can achieve good accuracy in a wide range of Knudsen numbers. Remarkably, the learned dynamics can give satisfactory results with randomly sampled discontinuous initial data and Sod's shock tube problem although it is trained only with smooth initial data.