Ziming Yu

h-index: 4

3 papers

15 citations

Novelty: 65%

AI Score: 48

Ranked #52,630 of 201,326 authors (top 26%); #11,979 in LG (top 28%)

3 Papers

LG | Feb 25

Global River Forecasting with a Topology-Informed AI Foundation Model

Hancheng Ren, Gang Zhao, Shuo Wang et al.

River systems operate as inherently interconnected continuous networks, meaning river hydrodynamic simulation ought to be a systemic process. However, widespread hydrology data scarcity often restricts data-driven forecasting to isolated predictions. To achieve systemic simulation and reduce reliance on river observations, we present GraphRiverCast (GRC), a topology-informed AI foundation model designed to simulate multivariate river hydrodynamics in global river systems. GRC is capable of operating in a "ColdStart" mode, generating predictions without relying on historical river states for initialization. In 7-day global pseudo-hindcasts, GRC-ColdStart functions as a robust standalone simulator, achieving a Nash-Sutcliffe Efficiency (NSE) of approximately 0.82 without exhibiting the significant error accumulation typical of autoregressive paradigms. Ablation studies reveal that topological encoding serves as indispensable structural information in the absence of historical states, explicitly guiding hydraulic connectivity and network-scale mass redistribution to reconstruct flow dynamics. Furthermore, when adapted locally via a pre-training and fine-tuning strategy, GRC consistently outperforms physics-based and locally-trained AI baselines. Crucially, this superiority extends from gauged reaches to full river networks, underscoring the necessity of topology encoding and physics-based pre-training. Built on a physics-aligned neural operator architecture, GRC enables rapid and cross-scale adaptive simulation, establishing a collaborative paradigm bridging global hydrodynamic knowledge with local hydrological reality.
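The Nash-Sutcliffe Efficiency quoted in this abstract is a standard hydrology skill score: one minus the ratio of the squared simulation error to the variance of the observations around their mean. A minimal sketch of the textbook definition follows; the function name and the list-based interface are illustrative assumptions and are not taken from the GRC paper or its code.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Textbook NSE: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    Hypothetical helper for illustration only; not the paper's evaluation code.
    """
    if not observed or len(observed) != len(simulated):
        raise ValueError("observed and simulated must be non-empty and equal in length")

    mean_observed = sum(observed) / len(observed)
    # Squared error of the simulation against the observations.
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Squared deviation of the observations from their own mean.
    observed_spread = sum((o - mean_observed) ** 2 for o in observed)
    if observed_spread == 0:
        raise ValueError("NSE is undefined when all observations are identical")

    return 1.0 - squared_error / observed_spread
```

A score of 1 means the simulation matches the observations exactly, and a score of 0 means it does no better than always predicting the observed mean, so the reported value of approximately 0.82 sits well above that mean-prediction baseline.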

IT | Mar 25

A Measurement-Calibrated AI-Assisted Digital Twin for Terahertz Wireless Data Centers

Mingjie Zhu, Yejian Lyu, Ziming Yu et al.

Terahertz (THz) wireless communication has emerged as a promising solution for future data center interconnects; however, accurate channel characterization and system-level performance evaluation in complex indoor environments remain challenging. In this work, a measurement-calibrated AI-assisted digital twin (DT) framework is developed for THz wireless data centers by tightly integrating channel measurements, ray-tracing (RT), and implicit neural field (INF) modeling. Specifically, channel measurements are first conducted using a vector network analyzer at 300 GHz under both line-of-sight (LoS) and non-line-of-sight (NLoS) scenarios. RT simulations performed on the Sionna platform capture the dominant multipath structures and show good consistency with measured results. Building upon measurement and RT data, an RT-conditioned INF is developed to construct a continuous radio-frequency (RF) field representation, enabling accurate prediction in RT-missing NLoS regions. The comprehensive RF map generated by DT can provide system-level analysis and decisions for wireless data centers.

LG | Oct 11, 2024 | Code

Zeroth-Order Fine-Tuning of LLMs in Random Subspaces

Ziming Yu, Pan Zhou, Sike Wang et al.

Fine-tuning Large Language Models (LLMs) has proven effective for a variety of downstream tasks. However, as LLMs grow in size, the memory demands for backpropagation become increasingly prohibitive. Zeroth-order (ZO) optimization methods offer a memory-efficient alternative by using forward passes to estimate gradients, but the variance of gradient estimates typically scales linearly with the model's parameter dimension, a significant issue for LLMs. In this paper, we propose the random Subspace Zeroth-order (SubZero) optimization to address the challenges posed by LLMs' high dimensionality. We introduce a low-rank perturbation tailored for LLMs that significantly reduces memory consumption while improving training performance. Additionally, we prove that our gradient estimation closely approximates the backpropagation gradient, exhibits lower variance than traditional ZO methods, and ensures convergence when combined with SGD. Experimental results show that SubZero enhances fine-tuning performance and achieves faster convergence compared to standard ZO approaches like MeZO across various language modeling tasks. Code is available at https://github.com/zimingyy/SubZero.
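To make "using forward passes to estimate gradients" concrete, here is a minimal sketch of the generic two-point zeroth-order estimator that standard ZO fine-tuning methods build on. It is not SubZero: the low-rank subspace perturbation described in the abstract is not implemented, and the function names, the toy quadratic loss, and the hyperparameters are all assumptions chosen for illustration.

```python
import random


def zo_gradient_estimate(loss_fn, params, eps=1e-3, rng=random):
    """Estimate the gradient from two loss evaluations (no backpropagation).

    Generic two-point estimator with a full-dimensional Gaussian perturbation;
    hypothetical helper, not the SubZero low-rank perturbation.
    """
    direction = [rng.gauss(0.0, 1.0) for _ in params]
    plus = [p + eps * z for p, z in zip(params, direction)]
    minus = [p - eps * z for p, z in zip(params, direction)]
    # Directional derivative along the random direction, by central difference.
    slope = (loss_fn(plus) - loss_fn(minus)) / (2.0 * eps)
    return [slope * z for z in direction]


def zo_sgd(loss_fn, params, lr=0.05, steps=500, eps=1e-3, seed=0):
    """Plain SGD driven by the zeroth-order gradient estimate above."""
    rng = random.Random(seed)
    params = list(params)
    for _ in range(steps):
        grad = zo_gradient_estimate(loss_fn, params, eps, rng)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


def quadratic_loss(params):
    """Toy objective with its minimum at all-ones (stand-in for a model loss)."""
    return sum((p - 1.0) ** 2 for p in params)


if __name__ == "__main__":
    start = [0.0, 0.0, 0.0]
    final = zo_sgd(quadratic_loss, start)
    print("initial loss:", quadratic_loss(start))
    print("final loss:", quadratic_loss(final))
```

Each estimate perturbs every parameter at once, which is why the variance of such estimators grows with the parameter dimension; restricting the perturbation to a low-dimensional random subspace, as the abstract describes, is the paper's approach to that problem.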