Yuanchao Xu

LG
h-index26
6papers
17citations
Novelty48%
AI Score46

6 Papers

LGDec 21, 2025
Generative Modeling through Spectral Analysis of Koopman Operator

Yuanchao Xu, Fengyi Li, Masahiro Fujisawa et al.

We propose Koopman Spectral Wasserstein Gradient Descent (KSWGD), a generative modeling framework that combines operator-theoretic spectral analysis with optimal transport. The novel insight is that the spectral structure required for accelerated Wasserstein gradient descent can be directly estimated from trajectory data via Koopman operator approximation which can eliminate the need for explicit knowledge of the target potential or neural network training. We provide rigorous convergence analysis and establish connection to Feynman-Kac theory that clarifies the method's probabilistic foundation. Experiments across diverse settings, including compact manifold sampling, metastable multi-well systems, image generation, and high dimensional stochastic partial differential equation, demonstrate that KSWGD consistently achieves faster convergence than other existing methods while maintaining high sample quality.

LGSep 30, 2025Code
TASP: Topology-aware Sequence Parallelism

Yida Wang, Ke Hong, Xiuhong Li et al.

Long-context large language models (LLMs) face constraints due to the quadratic complexity of the self-attention mechanism. The mainstream sequence parallelism (SP) method, Ring Attention, attempts to solve this by distributing the query into multiple query chunks across accelerators and enable each Q tensor to access all KV tensors from other accelerators via the Ring AllGather communication primitive. However, it exhibits low communication efficiency, restricting its practical applicability. This inefficiency stems from the mismatch between the Ring AllGather communication primitive it adopts and the AlltoAll topology of modern accelerators. A Ring AllGather primitive is composed of iterations of ring-styled data transfer, which can only utilize a very limited fraction of an AlltoAll topology. Inspired by the Hamiltonian decomposition of complete directed graphs, we identify that modern accelerator topology can be decomposed into multiple orthogonal ring datapaths which can concurrently transfer data without interference. Based on this, we further observe that the Ring AllGather primitive can also be decomposed into the same number of concurrent ring-styled data transfer at every iteration. Based on these insights, we propose TASP, a topology-aware SP method for long-context LLMs that fully utilizes the communication capacity of modern accelerators via topology decomposition and primitive decomposition. Experimental results on both single-node and multi-node NVIDIA H100 systems and a single-node AMD MI300X system demonstrate that TASP achieves higher communication efficiency than Ring Attention on these modern accelerator topologies and achieves up to 3.58 speedup than Ring Attention and its variant Zigzag-Ring Attention. The code is available at https://github.com/infinigence/HamiltonAttention.

LGJan 1, 2025
ResKoopNet: Learning Koopman Representations for Complex Dynamics with Spectral Residuals

Yuanchao Xu, Kaidi Shao, Nikos Logothetis et al.

Analyzing the long-term behavior of high-dimensional nonlinear dynamical systems remains a significant challenge. While the Koopman operator framework provides a powerful global linearization tool, current methods for approximating its spectral components often face theoretical limitations and depend on predefined dictionaries. Residual Dynamic Mode Decomposition (ResDMD) advanced the field by introducing the \emph{spectral residual} to assess Koopman operator approximation accuracy; however, its approach of only filtering precomputed spectra prevents the discovery of the operator's complete spectral information, a limitation known as the `spectral inclusion' problem. We introduce ResKoopNet (Residual-based Koopman-learning Network), a novel method that directly addresses this by explicitly minimizing the \emph{spectral residual} to compute Koopman eigenpairs. This enables the identification of a more precise and complete Koopman operator spectrum. Using neural networks, our approach provides theoretical guarantees while maintaining computational adaptability. Experiments on a variety of physical and biological systems show that ResKoopNet achieves more accurate spectral approximations than existing methods, particularly for high-dimensional systems and those with continuous spectra, which demonstrates its effectiveness as a tool for analyzing complex dynamical systems.

LGFeb 2
How Does the Lagrangian Guide Safe Reinforcement Learning through Diffusion Models?

Xiaoyuan Cheng, Wenxuan Yuan, Boyang Li et al.

Diffusion policy sampling enables reinforcement learning (RL) to represent multimodal action distributions beyond suboptimal unimodal Gaussian policies. However, existing diffusion-based RL methods primarily focus on offline settings for reward maximization, with limited consideration of safety in online settings. To address this gap, we propose Augmented Lagrangian-Guided Diffusion (ALGD), a novel algorithm for off-policy safe RL. By revisiting optimization theory and energy-based model, we show that the instability of primal-dual methods arises from the non-convex Lagrangian landscape. In diffusion-based safe RL, the Lagrangian can be interpreted as an energy function guiding the denoising dynamics. Counterintuitively, direct usage destabilizes both policy generation and training. ALGD resolves this issue by introducing an augmented Lagrangian that locally convexifies the energy landscape, yielding a stabilized policy generation and training process without altering the distribution of the optimal policy. Theoretical analysis and extensive experiments demonstrate that ALGD is both theoretically grounded and empirically effective, achieving strong and stable performance across diverse environments.

ROOct 12, 2021
Enabling Level-4 Autonomous Driving on a Single $1k Off-the-Shelf Card

Hsin-Hsuan Sung, Yuanchao Xu, Jiexiong Guan et al.

Autonomous driving is of great interest in both research and industry. The high cost has been one of the major roadblocks that slow down the development and adoption of autonomous driving in practice. This paper, for the first-time, shows that it is possible to run level-4 (i.e., fully autonomous driving) software on a single off-the-shelf card (Jetson AGX Xavier) for less than $1k, an order of magnitude less than the state-of-the-art systems, while meeting all the requirements of latency. The success comes from the resolution of some important issues shared by existing practices through a series of measures and innovations. The study overturns the common perceptions of the computing resources required by level-4 autonomous driving, points out a promising path for the industry to lower the cost, and suggests a number of research opportunities for rethinking the architecture, software design, and optimizations of autonomous driving.

LGJul 16, 2021
Decentralized Multi-Agent Reinforcement Learning for Task Offloading Under Uncertainty

Yuanchao Xu, Amal Feriani, Ekram Hossain

Multi-Agent Reinforcement Learning (MARL) is a challenging subarea of Reinforcement Learning due to the non-stationarity of the environments and the large dimensionality of the combined action space. Deep MARL algorithms have been applied to solve different task offloading problems. However, in real-world applications, information required by the agents (i.e. rewards and states) are subject to noise and alterations. The stability and the robustness of deep MARL to practical challenges is still an open research problem. In this work, we apply state-of-the art MARL algorithms to solve task offloading with reward uncertainty. We show that perturbations in the reward signal can induce decrease in the performance compared to learning with perfect rewards. We expect this paper to stimulate more research in studying and addressing the practical challenges of deploying deep MARL solutions in wireless communications systems.