Beibei Zhu

LG
h-index: 8
Papers: 7
Citations: 80
Novelty: 52%
AI Score: 42

7 Papers

NA Jun 3
A novel class of high-order uniformly accurate exponential integrators with local linear extension for the charged-particle dynamics under strong magnetic field

Lina Wang, Bin Wang, Beibei Zhu

In this paper, we develop a novel class of high-order uniformly accurate exponential integrators for charged-particle dynamics under a strong magnetic field. The small parameter $0<\varepsilon\ll 1$ induces rapid temporal oscillations, rendering traditional numerical methods prohibitively expensive due to severe step-size restrictions. To address this issue, a linearization technique that introduces auxiliary polynomial variables is employed to recast the original charged-particle dynamics as a higher-dimensional system. Classical exponential integrators are subsequently applied to this augmented formulation, which inherently carries richer structural information, thereby yielding a family of uniformly accurate exponential integrators that can reach arbitrarily high order without requiring any order conditions. For a strong magnetic field under maximal ordering scaling, we rigorously demonstrate via algebraic techniques that the proposed schemes with auxiliary polynomial variables of degree $k$ ($k\geq 2$) achieve an improved $\mathcal{O}(\varepsilon h^{k+1})$ error estimate for the position and a uniform $\mathcal{O}(h^{k+1})$ error estimate for the velocity. Numerical experiments validate the advantages of the methods. Finally, the theoretical and numerical investigation is extended to relativistic charged-particle dynamics in a four-dimensional framework with a strong magnetic field under maximal ordering scaling.

NA Mar 31, 2023
Implementation and (Inverse Modified) Error Analysis for implicitly-templated ODE-nets

Aiqing Zhu, Tom Bertalan, Beibei Zhu et al.

We focus on learning unknown dynamics from data using ODE-nets templated on implicit numerical initial value problem solvers. First, we perform Inverse Modified error analysis of the ODE-nets using unrolled implicit schemes for ease of interpretation. It is shown that training an ODE-net using an unrolled implicit scheme returns a close approximation of an Inverse Modified Differential Equation (IMDE). In addition, we establish a theoretical basis for hyper-parameter selection when training such ODE-nets, whereas current strategies usually treat numerical integration of ODE-nets as a black box. We thus formulate an adaptive algorithm which monitors the level of error and adapts the number of (unrolled) implicit solution iterations during the training process, so that the error of the unrolled approximation is less than the current learning loss. This helps accelerate training while maintaining accuracy. Several numerical experiments are performed to demonstrate the advantages of the proposed algorithm compared to nonadaptive unrollings, and to validate the theoretical analysis. We also note that this approach naturally allows for incorporating partially known physical terms in the equations, giving rise to what is termed "gray box" identification.
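
The adaptive idea in this abstract (raise the number of unrolled implicit iterations until the unrolling error drops below the current learning loss) can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes implicit Euler solved by fixed-point iteration, uses the residual of the implicit equation as a proxy for the unrolling error, and the names `unrolled_implicit_euler`, `adapt_iterations`, and the linear stand-in `f` are hypothetical.

```python
import numpy as np

def unrolled_implicit_euler(f, x, h, n_iter):
    """Approximate the implicit Euler step y = x + h*f(y) with n_iter fixed-point iterations."""
    y = x.copy()
    for _ in range(n_iter):
        y = x + h * f(y)
    return y

def adapt_iterations(f, x, h, loss, n_iter=1, max_iter=50):
    """Increase n_iter until the implicit-equation residual is below the current loss.

    The residual is only a proxy (an assumption of this sketch) for the
    error of the unrolled approximation mentioned in the abstract.
    """
    while n_iter < max_iter:
        y = unrolled_implicit_euler(f, x, h, n_iter)
        residual = np.linalg.norm(y - (x + h * f(y)))
        if residual < loss:
            break
        n_iter += 1
    return n_iter

# Hypothetical stand-in for a learned vector field.
f = lambda y: -2.0 * y
x0 = np.array([1.0, -0.5])

n_coarse = adapt_iterations(f, x0, h=0.1, loss=1e-2)  # early training: large loss
n_fine = adapt_iterations(f, x0, h=0.1, loss=1e-6)    # late training: small loss
print(n_coarse, n_fine)
```

With a large loss few iterations are needed, and the count grows as the loss shrinks, which is where the claimed training speed-up would come from.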

LG Jun 15, 2022
On Numerical Integration in Neural Ordinary Differential Equations

Aiqing Zhu, Pengzhan Jin, Beibei Zhu et al.

The combination of ordinary differential equations and neural networks, i.e., neural ordinary differential equations (Neural ODE), has been widely studied from various angles. However, deciphering the numerical integration in Neural ODE is still an open challenge, as many studies have demonstrated that numerical integration significantly affects the performance of the model. In this paper, we propose the inverse modified differential equations (IMDE) to clarify the influence of numerical integration on training Neural ODE models. IMDE is determined by the learning task and the employed ODE solver. It is shown that training a Neural ODE model actually returns a close approximation of the IMDE, rather than the true ODE. With the help of IMDE, we deduce that (i) the discrepancy between the learned model and the true ODE is bounded by the sum of discretization error and learning loss; (ii) Neural ODE models using non-symplectic numerical integration theoretically fail to learn conservation laws. Several experiments are performed to numerically verify our theoretical analysis.
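
A small worked example may make the IMDE claim concrete. It assumes (this reading is not spelled out in the abstract) that the IMDE is the vector field whose one-step numerical solution reproduces the exact flow of the true ODE. For the scalar test problem $x' = \lambda x$ and explicit Euler with step $h$, a perfectly trained linear model $x' = \mu x$ must satisfy $1 + h\mu = e^{\lambda h}$, so $\mu = (e^{\lambda h} - 1)/h = \lambda + \lambda^2 h/2 + \dots$, which differs from $\lambda$ by a first-order discretization error. The test problem and the function name below are hypothetical.

```python
import math

lam = -1.0  # true ODE: x' = lam * x (hypothetical test problem)

def euler_matched_coefficient(lam, h):
    """Return mu such that one explicit Euler step of x' = mu*x equals the exact flow exp(lam*h)."""
    return math.expm1(lam * h) / h

gaps = []
for h in (0.1, 0.05, 0.025):
    mu = euler_matched_coefficient(lam, h)
    gaps.append(abs(mu - lam))
    print(f"h={h:<6} mu={mu:.6f}  |mu - lam|={gaps[-1]:.6f}")
```

Halving `h` roughly halves the gap between the learned coefficient and the true one, consistent with a discrepancy dominated by first-order discretization error when the learning loss is zero.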

NA Mar 15, 2018
A stroboscopic averaging algorithm for highly oscillatory delay problems

J. M. Sanz-Serna, Beibei Zhu

We propose and analyze a heterogeneous multiscale method for the efficient integration of constant-delay differential equations subject to fast periodic forcing. The stroboscopic averaging method (SAM) suggested here may provide approximations with $\mathcal{O}(H^2+1/\Omega^2)$ errors with a computational effort that grows like $H^{-1}$ (the inverse of the stepsize), uniformly in the forcing frequency $\Omega$.

LG Apr 29, 2022
VPNets: Volume-preserving neural networks for learning source-free dynamics

Aiqing Zhu, Beibei Zhu, Jiawei Zhang et al.

We propose volume-preserving networks (VPNets) for learning unknown source-free dynamical systems using trajectory data. We propose three modules and combine them to obtain two network architectures, coined R-VPNet and LA-VPNet. The distinct feature of the proposed models is that they are intrinsically volume-preserving. In addition, the corresponding approximation theorems are proved, which theoretically guarantee the expressivity of the proposed VPNets to learn source-free dynamics. The effectiveness, generalization ability and structure-preserving property of the VPNets are demonstrated by numerical experiments.
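
To illustrate what a volume-preserving layer can look like, here is a minimal sketch of a generic shear map, $(x_1, x_2) \mapsto (x_1 + g(x_2), x_2)$, whose Jacobian is block unit-triangular and therefore has determinant 1 for any $g$. This is a standard construction chosen for illustration, not the R-VPNet or LA-VPNet modules from the paper; `shear_layer`, `numerical_jacobian`, and the random weights are hypothetical.

```python
import numpy as np

def shear_layer(x, W, b):
    """Map (x1, x2) -> (x1 + tanh(W @ x2 + b), x2); the Jacobian determinant is exactly 1."""
    d = x.shape[0] // 2
    x1, x2 = x[:d], x[d:]
    return np.concatenate([x1 + np.tanh(W @ x2 + b), x2])

def numerical_jacobian(g, x, eps=1e-6):
    """Central finite-difference Jacobian of g at x."""
    n = x.shape[0]
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        J[:, j] = (g(x + e) - g(x - e)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
W, b = rng.normal(size=(2, 2)), rng.normal(size=2)  # arbitrary (untrained) weights
x = rng.normal(size=4)

J = numerical_jacobian(lambda z: shear_layer(z, W, b), x)
det_J = np.linalg.det(J)
print(det_J)  # close to 1 up to finite-difference error
```

Because the determinant is 1 regardless of the weights, compositions of such layers stay volume-preserving throughout training rather than only approximately after it.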

LG May 29, 2025
LlamaRL: A Distributed Asynchronous Reinforcement Learning Framework for Efficient Large-scale LLM Training

Bo Wu, Sid Wang, Yunhao Tang et al.

Reinforcement Learning (RL) has become the most effective post-training approach for improving the capabilities of Large Language Models (LLMs). In practice, because of the high demands on latency and memory, it is particularly challenging to develop an efficient RL framework that reliably manages policy models with hundreds to thousands of billions of parameters. In this paper, we present LlamaRL, a fully distributed, asynchronous RL framework optimized for efficient training of large-scale LLMs with various model sizes (8B, 70B, and 405B parameters) on GPU clusters ranging from a handful to thousands of devices. LlamaRL introduces a streamlined, single-controller architecture built entirely on native PyTorch, enabling modularity, ease of use, and seamless scalability to thousands of GPUs. We also provide a theoretical analysis of LlamaRL's efficiency, including a formal proof that its asynchronous design leads to strict RL speed-up. Empirically, during Llama 3 post-training, by leveraging best practices such as colocated model offloading, asynchronous off-policy training, and distributed direct memory access for weight synchronization, LlamaRL achieves significant efficiency gains, up to a 10.7x speed-up compared to DeepSpeed-Chat-like systems on a 405B-parameter policy model. Furthermore, the efficiency advantage continues to grow with increasing model scale, demonstrating the framework's suitability for future large-scale RL training.