AI · Aug 30, 2023 · Code
Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO
Yangkun Chen, Joseph Suarez, Junjie Zhang et al.
We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions. This competition targets robustness and generalization in multi-agent systems: participants train teams of agents to complete a multi-task objective against opponents not seen during training. The competition combines relatively complex environment design with large numbers of agents in the environment. The top submissions demonstrate strong success on this task using mostly standard reinforcement learning (RL) methods combined with domain-specific engineering. We summarize the competition design and results and suggest that competitions may be a powerful approach for the academic community to solve hard problems and establish solid benchmarks for algorithms. We will open-source our benchmark, including the environment wrapper, baselines, a visualization tool, and selected policies, for further research.
AI · Nov 7, 2023 · Code
The NeurIPS 2022 Neural MMO Challenge: A Massively Multiagent Competition with Specialization and Trade
Enhong Liu, Joseph Suarez, Chenhui You et al.
In this paper, we present the results of the NeurIPS-2022 Neural MMO Challenge, which attracted 500 participants and received over 1,600 submissions. Like the previous IJCAI-2022 Neural MMO Challenge, it involved agents from 16 populations surviving in procedurally generated worlds by collecting resources and defeating opponents. This year's competition runs on the latest v1.6 Neural MMO, which introduces new equipment, combat, trading, and a better scoring system. These elements combine to pose additional robustness and generalization challenges not present in previous competitions. This paper summarizes the design and results of the challenge, explores the potential of this environment as a benchmark for learning methods, and presents some practical reinforcement learning training approaches for complex tasks with sparse rewards. Additionally, we have open-sourced our baselines, including environment wrappers, benchmarks, and visualization tools for future research.
42.2 · NA · May 8
Numerical Homogenization of Landau-Lifshitz Equation with Rough Coefficients
Zetao Ma, Jingrun Chen, Rui Du et al.
In this work, we develop a numerical homogenization approach for the fully nonlinear Landau-Lifshitz equation with rough coefficients, including non-periodicity and nonseparable scales. Direct numerical resolution of such multiscale problems on fine meshes incurs prohibitive computational costs. To address this challenge, we propose an efficient coarse scale approximation through localized basis functions derived from energy minimization within the Generalized Rough Polyharmonic Splines (GRPS) framework. These basis functions preserve critical multiscale features while operating on a computationally tractable coarse mesh. The nonlinear, vectorial, and non-symmetric nature of the Landau-Lifshitz equation necessitates careful design of variational formulations for basis construction. We introduce several such formulations, each tailored to specific structural aspects of the problem. Through systematic numerical experiments, we demonstrate that our approach achieves significant computational savings without compromising accuracy, offering a robust framework for simulating multiscale magnetic systems with complex microstructures.
17.8 · CL · Mar 24
The Diminishing Returns of Early-Exit Decoding in Modern LLMs
Rui Wei, Rui Du, Hanfei Yu et al.
In Large Language Model (LLM) inference, early-exit refers to stopping computation at an intermediate layer once the prediction is sufficiently confident, thereby reducing latency and cost. However, recent LLMs adopt improved pretraining recipes and architectures that reduce layer redundancy, potentially limiting early-exit opportunities. We re-evaluate layer-wise early-exit in modern LLMs and analyze how intermediate representations evolve during training. We introduce a metric to quantify a model's intrinsic suitability for early-exit and propose a benchmark for researchers to explore the potential early-exit benefits on different models and workloads. Our results show a diminishing trend in early-exit effectiveness across newer model generations. We further find that dense transformers generally offer greater early-exit potential than Mixture-of-Experts and State Space Models. In addition, larger models, particularly those with more than 20 billion parameters, and base pretrained models without specialized tuning tend to exhibit higher early-exit potential.
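The early-exit rule described in this abstract can be illustrated with a minimal sketch. It assumes that confidence is measured as the top softmax probability at each layer, which the abstract does not specify; the function names, the toy logits, and the 0.9 threshold are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit(per_layer_logits, threshold=0.9):
    """Return (layer_index, token_id) for the first layer that is confident.

    Assumption: confidence is the top softmax probability at that layer.
    If no layer reaches the threshold, the last layer's prediction is used.
    """
    if not per_layer_logits:
        raise ValueError("need logits for at least one layer")
    for layer, logits in enumerate(per_layer_logits):
        probs = softmax(logits)
        token = max(range(len(probs)), key=probs.__getitem__)
        if probs[token] >= threshold:
            return layer, token  # stop here and skip the remaining layers
    return layer, token  # no early exit: full-depth prediction


# Toy logits for a 4-layer model over a 3-token vocabulary (hypothetical).
toy_logits = [
    [0.1, 0.2, 0.0],
    [1.0, 3.0, 0.5],
    [0.5, 6.0, 0.2],
    [0.0, 9.0, 0.1],
]
exit_layer, exit_token = early_exit(toy_logits, threshold=0.9)
```

In this toy setting, a model whose intermediate layers rarely reach the threshold always runs to the final layer, which is the reduced early-exit opportunity the abstract attributes to newer model generations.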
54.1 · NA · May 15
Finite volume element method for Landau-Lifshitz equation
Yunjie Gong, Jingrun Chen, Rui Du et al.
The Landau-Lifshitz equation describes the dynamics of magnetization in ferromagnetic materials. Due to its essential nonlinearity and nonconvex constraint, it is typically solved numerically. In this paper, we develop a finite volume element method (FVEM) combined with the Gauss-Seidel projection method (GSPM) for micromagnetics simulations. We provide the approximation error in space and describe the energy law when the FVEM is adopted. Because the GSPM is used for time-marching, the discrete system is decoupled component by component, making the computational complexity comparable to that of solving the scalar heat equation implicitly. This significantly accelerates practical simulations. We present several numerical experiments to validate the theoretical analysis and the efficiency gain. Additionally, we study the blow-up solution and efficiently simulate 2D magnetic textures using the proposed method.
NA · Nov 5, 2019
Quasi-Monte Carlo sampling for machine-learning partial differential equations
Jingrun Chen, Rui Du, Panchi Li et al.
Solving high-dimensional partial differential equations with deep neural networks has attracted significant attention in recent years. In many scenarios, the loss function is defined as an integral over a high-dimensional domain. The Monte Carlo (MC) method, together with a deep neural network, is used to overcome the curse of dimensionality where classical methods fail. Often, a deep neural network outperforms classical numerical methods in terms of both accuracy and efficiency. In this paper, we propose using quasi-Monte Carlo sampling instead of the MC method to approximate the loss function. To demonstrate the idea, we conduct numerical experiments in the framework of the deep Ritz method proposed by Weinan E and Bing Yu. For the same accuracy requirement, we observe that quasi-Monte Carlo sampling reduces the size of the training data set by more than two orders of magnitude compared to the MC method. Under some assumptions, we prove that quasi-Monte Carlo sampling together with the deep neural network generates a convergent series with a rate proportional to the approximation accuracy of the quasi-Monte Carlo method for numerical integration. Numerically, the fitted convergence rate is slightly smaller, but the proposed approach always outperforms the MC method. It is worth mentioning that the convergence analysis is generic whenever a loss function is approximated by the quasi-Monte Carlo method, although the observations here are based on the deep Ritz method.
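The core idea of replacing random samples with a low-discrepancy sequence when approximating an integral can be sketched as follows. This is only an illustration under stated assumptions: it uses a Halton sequence (the abstract does not say which quasi-Monte Carlo sequence the authors use), a simple test integrand with a known integral, and no neural network or deep Ritz loss. All function names and parameters are hypothetical.

```python
import random

PRIMES = (2, 3, 5, 7)  # one coprime base per dimension (supports dim <= 4)


def radical_inverse(index, base):
    """Van der Corput radical inverse: mirror the base-b digits of index."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(n, dim):
    """First n Halton points in [0, 1)^dim, starting at index 1."""
    return [[radical_inverse(i, PRIMES[d]) for d in range(dim)]
            for i in range(1, n + 1)]


def integrand(x):
    """Test function sum(x_i^2); its integral over the unit cube is dim / 3."""
    return sum(v * v for v in x)


def mean_over(points):
    """Sample-mean approximation of the integral over the unit cube."""
    return sum(integrand(p) for p in points) / len(points)


def compare(n=4096, dim=4, seed=0):
    """Return (mc_error, qmc_error) for the same number of sample points."""
    rng = random.Random(seed)
    mc_points = [[rng.random() for _ in range(dim)] for _ in range(n)]
    exact = dim / 3.0
    mc_error = abs(mean_over(mc_points) - exact)
    qmc_error = abs(mean_over(halton_points(n, dim)) - exact)
    return mc_error, qmc_error


mc_error, qmc_error = compare()
```

In a deep Ritz style setting, the same point sets would be used to evaluate the integral loss at each training step; the sketch covers only the numerical integration part.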