Xiaoqun Wang

3 papers

5 citations

Novelty: 65%

AI Score: 41

Ranked #93,473 of 201,018 authors (top 46%)

#281 in NA (top 53%)

3 Papers

NA · Oct 28, 2022

Convergence analysis of a quasi-Monte Carlo-based deep learning algorithm for solving partial differential equations

Fengjiang Fu, Xiaoqun Wang

Deep learning methods have achieved great success in solving partial differential equations (PDEs), where the loss is often defined as an integral. The accuracy and efficiency of these algorithms depend greatly on the quadrature method. We propose to apply quasi-Monte Carlo (QMC) methods to the Deep Ritz Method (DRM) for solving the Neumann problems for the Poisson equation and the static Schrödinger equation. For error estimation, we decompose the error of using the deep learning algorithm to solve PDEs into the generalization error, the approximation error and the training error. We establish the upper bounds and prove that QMC-based DRM achieves an asymptotically smaller error bound than DRM. Numerical experiments show that the proposed method converges faster in all cases and the variances of the gradient estimators of randomized QMC-based DRM are much smaller than those of DRM, which illustrates the superiority of QMC in deep learning over MC.
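The role of the quadrature rule described in this abstract can be illustrated with a toy comparison of plain Monte Carlo and quasi-Monte Carlo on a fixed integral. This is only a sketch under stated assumptions: the integrand `exp(x + y)`, the Halton construction, and all function names are illustrative choices, not the loss, point set, or code used in the paper, and no neural network or Deep Ritz loss is involved.

```python
import math
import random


def van_der_corput(index, base):
    """Radical inverse of `index` in `base`, a value in [0, 1)."""
    result, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        result += digit / denom
    return result


def halton_points(n, bases=(2, 3)):
    """First n points of a 2-D Halton sequence (index 0 is skipped)."""
    return [tuple(van_der_corput(i, b) for b in bases) for i in range(1, n + 1)]


def integrand(x, y):
    # Hypothetical smooth test function on the unit square.
    return math.exp(x + y)


# Exact value of the integral of exp(x + y) over [0, 1]^2.
EXACT = (math.e - 1.0) ** 2


def estimate(points):
    """Equal-weight quadrature: the sample mean of the integrand."""
    return sum(integrand(x, y) for x, y in points) / len(points)


def compare(n=4096, seed=0):
    """Return (MC absolute error, QMC absolute error) for n points."""
    rng = random.Random(seed)
    mc_points = [(rng.random(), rng.random()) for _ in range(n)]
    mc_error = abs(estimate(mc_points) - EXACT)
    qmc_error = abs(estimate(halton_points(n)) - EXACT)
    return mc_error, qmc_error


if __name__ == "__main__":
    mc_error, qmc_error = compare()
    print(f"MC error:  {mc_error:.2e}")
    print(f"QMC error: {qmc_error:.2e}")
```

Both estimators use the same sample-mean formula; only the point set changes. The Halton points here are deterministic, whereas the abstract's variance comparison concerns randomized QMC, which would add a random shift or scrambling on top of such a sequence.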

NA · Apr 3

Nested Multilevel Monte Carlo with Preintegration for Efficient Risk Estimation

Yu Xu, Xiaoqun Wang

Nested Monte Carlo is widely used for risk estimation, but its efficiency is limited by the discontinuity of the indicator function and high computational cost. This paper proposes a nested Multilevel Monte Carlo (MLMC) method combined with preintegration for efficient risk estimation. We first use preintegration to integrate out one outer random variable, which effectively handles the discontinuity of the indicator function, then we construct the MLMC estimator with preintegration to reduce the computational cost. Our theoretical analysis proves that the strong convergence rate of the MLMC combined with preintegration reaches -1, compared with -1/2 for the standard MLMC. Consequently, we obtain a nearly optimal computational complexity. Besides, our method can also handle the high-kurtosis phenomenon caused by indicator functions. Numerical experiments verify that the smoothed MLMC with preintegration outperforms the standard MLMC and the optimal computational cost can be attained. Combining our method with quasi-Monte Carlo further improves its performance in high dimensions.

Keywords: Nested simulation, Multilevel Monte Carlo, Risk estimation, Preintegration
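The smoothing effect of preintegration mentioned in this abstract can be seen in a much simpler setting than the paper's. The sketch below is an assumption-laden toy, not the authors' nested MLMC estimator: it estimates P(X1 + X2 > c) for independent standard normals, once with the raw indicator and once after integrating out X1 analytically, which replaces the indicator by the smooth function 1 - Φ(c - X2). The threshold `c`, the sample size, and all names are hypothetical.

```python
import math
import random
import statistics


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def plain_samples(n, c, rng):
    """Samples of the discontinuous indicator 1{X1 + X2 > c}."""
    return [1.0 if rng.gauss(0, 1) + rng.gauss(0, 1) > c else 0.0
            for _ in range(n)]


def preintegrated_samples(n, c, rng):
    """Samples after integrating out X1: P(X1 > c - X2 | X2) = 1 - Phi(c - X2)."""
    return [1.0 - normal_cdf(c - rng.gauss(0, 1)) for _ in range(n)]


def compare(n=20000, c=2.0, seed=1):
    """Return the exact probability plus (mean, variance) for each estimator."""
    rng = random.Random(seed)
    plain = plain_samples(n, c, rng)
    smooth = preintegrated_samples(n, c, rng)
    # X1 + X2 ~ N(0, 2), so the exact value is 1 - Phi(c / sqrt(2)).
    exact = 1.0 - normal_cdf(c / math.sqrt(2.0))
    return {
        "exact": exact,
        "plain": (statistics.fmean(plain), statistics.pvariance(plain)),
        "preintegrated": (statistics.fmean(smooth), statistics.pvariance(smooth)),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(name, value)
```

By the law of total variance, conditioning on X2 cannot increase the per-sample variance, and the resulting integrand is smooth in X2. That smoothness is the property the abstract relies on, both for the improved MLMC convergence rate and for combining the method with quasi-Monte Carlo.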

DC · Jun 12, 2024

ProTrain: Efficient LLM Training via Memory-Aware Techniques

Hanmei Yang, Jin Zhou, Yao Fu et al.

Training Large Language Models (LLMs) is extremely memory-hungry. To address this problem, existing work, such as ZeRO-Offload, combines CPU and GPU resources during training. Such techniques largely democratize billion-scale model training, making it possible to train with a few consumer graphics cards. However, based on our observation, existing frameworks often provide coarse-grained memory management and require experienced experts for configuration tuning, leading to suboptimal hardware utilization and performance. This paper proposes ProTrain, a novel training system that intelligently balances memory usage and performance by coordinating memory, computation, and IO. ProTrain achieves adaptive memory management through Chunk-Based Model State Management and Block-Wise Activation Management, guided by a Memory-Aware Runtime Profiler without user intervention. ProTrain does not change the training algorithm and thus does not compromise accuracy. Experiments show that ProTrain improves training throughput by 1.43× to 2.71× compared to the SOTA training systems.