Yiheng Chen

LG
h-index: 10
7 papers
45 citations
Novelty: 46%
AI Score: 51

7 Papers

CV Apr 18
When Earth Foundation Models Meet Diffusion: An Application to Land Surface Temperature Super-Resolution

Yiheng Chen, Zihui Ma, Peishi Jiang et al.

Land surface temperature (LST) super-resolution is important for environmental monitoring. However, it remains challenging as coarse thermal observations severely underdetermine fine-scale structure. In this paper, we propose Earth Foundation Model-guided Diffusion (EFDiff), a novel framework for super-resolution under extreme spatial degradation. EFDiff uses the Prithvi-EO-2.0 Earth foundation model to encode high-resolution multispectral reflectance into geospatial embeddings, which are injected into the denoising network via cross-attention to guide fine-scale reconstruction from highly degraded observations. We study two variants, EFDiff-$\epsilon$ and EFDiff-$x_0$, which offer complementary trade-offs between perceptual realism and pixel-level fidelity. We evaluate EFDiff under an extreme $32\times$ scale gap using a globally diverse benchmark comprising 242,416 co-registered Landsat thermal-reflectance patches. Results show that EFDiff consistently outperforms baseline methods and that cross-attention conditioning on Earth foundation model (EFM) embeddings is more effective than HLS channel concatenation. Although we present EFDiff in the context of LST super-resolution, the framework is broadly applicable to remote sensing problems in which pretrained geospatial representations can guide generative reconstruction.
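The cross-attention conditioning mentioned in this abstract can be illustrated with a minimal sketch. It is not the EFDiff implementation: the function name `cross_attention`, the absence of learned projections, and the toy inputs are all assumptions made for illustration. Queries stand in for denoiser features, and keys and values stand in for foundation-model embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (hypothetical sketch).

    queries: feature vectors from the network being conditioned
    keys, values: embedding vectors from the conditioning encoder
    Learned query/key/value projections are omitted for brevity.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every conditioning token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs
```

Each output row is a convex combination of the value vectors, so the conditioned features stay within the range spanned by the embeddings. A real model would add learned projections, multiple heads, and a residual connection.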

FLU-DYN Apr 18
FlowRefiner: Flow Matching-Based Iterative Refinement for 3D Turbulent Flow Simulation

Yilong Dai, Yiming Sun, Yiheng Chen et al.

Accurate autoregressive prediction of 3D turbulent flows remains challenging for neural PDE solvers, as small errors in fine-scale structures can accumulate rapidly over rollout. In this paper, we propose FlowRefiner, a flow matching-based iterative refinement framework for 3D turbulent flow simulation. The method replaces stochastic denoising refinement with deterministic ODE-based correction, uses a unified velocity-field regression objective across all refinement stages, and introduces a decoupled sigma schedule that fixes the noise range independently of refinement depth. These design choices yield stable and effective refinement in the small-noise regime. Experiments on large-scale 3D turbulence with rich multi-scale structures show that FlowRefiner achieves state-of-the-art autoregressive prediction accuracy and strong physical consistency. Although developed for turbulent flow simulation, the proposed framework is broadly applicable to iterative refinement problems in scientific modeling.
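As a rough illustration of velocity-field regression with deterministic ODE-based correction, the sketch below uses a linear interpolation path and fixed-step Euler integration. It is a generic flow matching toy, not FlowRefiner itself: the names `flow_matching_pair` and `euler_refine`, the linear path, and the oracle velocity function are assumptions, and the paper's sigma schedule is not modeled.

```python
def flow_matching_pair(x_start, x_target, t):
    """Return the point x_t on a linear path and its regression target.

    For the path x_t = (1 - t) * x_start + t * x_target, the target
    velocity is the constant difference x_target - x_start.
    """
    x_t = [(1 - t) * a + t * b for a, b in zip(x_start, x_target)]
    velocity = [b - a for a, b in zip(x_start, x_target)]
    return x_t, velocity


def euler_refine(x, velocity_fn, n_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    dt = 1.0 / n_steps
    for step in range(n_steps):
        t = step * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    noisy, clean = [0.5, -1.0], [1.0, 2.0]
    _, v_true = flow_matching_pair(noisy, clean, 0.5)
    # An oracle velocity stands in for a trained network here.
    print(euler_refine(noisy, lambda x, t: v_true, n_steps=4))
```

With the oracle velocity the integration lands on the clean state. In practice `velocity_fn` would be a learned network, and the number of steps trades cost against accuracy.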

SI Mar 20
Politicized Attention Shifts Amplify Polarization in the Information Ecosystem during California Wildfires

Yiheng Chen, Alina Hagen, Fan Yang et al.

Wildfires require governments to communicate under conditions of urgency, uncertainty, and intense public scrutiny, yet such communication now unfolds within a digitally mediated environment shaped by polarization and engagement-based amplification. We analyze over 1.3 million wildfire-related social media posts from California (2016-2025) to examine how institutional actors are evaluated within this landscape. Users' stance toward government is actor-specific: individual political officials are discussed more negatively than operational agencies across federal, state, and local levels, and this gap widens during extreme wildfire events. Moreover, interaction networks become increasingly modular over time, consolidating into polarized communities in which negativity concentrates within cohesive clusters. Engagement-weighted measures show that highly interactive negative content disproportionately shapes visible discourse, while crisis periods redirect attention from emergency agencies to high-profile political figures, reinforcing reputational divergence. These findings indicate that wildfire communication operates within a polarized, engagement-ranked ecosystem in which evaluative tone, network structure, and visibility dynamics jointly shape institutional perception. Effective disaster communication should therefore account for the structural conditions of contemporary digital public communities.
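One way to picture an engagement-weighted measure is to compare a plain mean of stance scores with a mean weighted by each post's interaction count. The sketch below is an assumption about the general idea, not the study's actual metric: the stance scale (-1 negative to +1 positive), the function names, and the toy numbers are all hypothetical.

```python
def mean_stance(stances):
    """Unweighted average stance across posts."""
    return sum(stances) / len(stances)


def engagement_weighted_stance(stances, engagements):
    """Average stance weighted by per-post engagement counts (hypothetical).

    stances: scores in [-1, 1], one per post
    engagements: non-negative interaction counts, one per post
    """
    total = sum(engagements)
    if total == 0:
        raise ValueError("engagement counts sum to zero")
    return sum(s * e for s, e in zip(stances, engagements)) / total


if __name__ == "__main__":
    stances = [1.0, 0.0, -1.0]
    engagements = [1, 1, 8]  # the negative post draws most interactions
    print(mean_stance(stances), engagement_weighted_stance(stances, engagements))
```

In this toy case the unweighted mean is neutral while the weighted mean is clearly negative, which is the kind of gap between posted and visible discourse that the abstract describes.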

RO May 14
FU-MPC: Frontier- and Uncertainty-Aware Model Predictive Control for Efficient and Accurate UAV Exploration with Motorized LiDAR

Jianping Li, Pengfei Wan, Zhongyuan Liu et al.

Efficient UAV exploration in unknown environments requires rapid coverage expansion while maintaining accurate and reliable localization, since safe navigation in complex scenes depends on consistent mapping and pose estimation. However, for conventional LiDAR-equipped UAVs, the observable region is tightly coupled with the UAV pose and motion. Expanding coverage often requires additional translational or rotational maneuvers, which can reduce exploration efficiency and increase the risk of localization degradation in geometrically challenging environments. Motorized rotating LiDARs provide a promising solution by actively adjusting the sensor viewing direction without changing the UAV motion, thereby introducing an additional sensing degree of freedom. Nevertheless, existing exploration systems rarely exploit this scanning freedom as an explicit decision variable linked to both exploration progress and localization quality. To address this gap, we develop a UAV platform equipped with an independently actuated rotating LiDAR and propose a hierarchical exploration framework. The global planner organizes frontiers into representative viewpoints and sequences them using topology-aware transition costs. Built upon this planner, FU-MPC serves as a local receding-horizon scan controller that optimizes LiDAR rotation along the predicted flight trajectory. The controller jointly considers frontier-aware exploration utility and direction-dependent localization uncertainty, while lightweight surrogate evaluation enables real-time onboard execution. Experiments in complex environments demonstrate that the proposed system improves exploration efficiency while maintaining robust localization performance compared with fixed-pattern scanning and uncertainty-only baselines. The project page can be found at https://kafeiyin00.github.io/FU-MPC/.

AI Oct 14, 2025
Empowering LLM Agents with Geospatial Awareness: Toward Grounded Reasoning for Wildfire Response

Yiheng Chen, Lingyao Li, Zihui Ma et al.

Effective disaster response is essential for safeguarding lives and property. Existing statistical approaches often lack semantic context, generalize poorly across events, and offer limited interpretability. While large language models (LLMs) provide few-shot generalization, they remain text-bound and blind to geography. To bridge this gap, we introduce a Geospatial Awareness Layer (GAL) that grounds LLM agents in structured earth data. Starting from raw wildfire detections, GAL automatically retrieves and integrates infrastructure, demographic, terrain, and weather information from external geodatabases, assembling them into a concise, unit-annotated perception script. This enriched context enables agents to produce evidence-based resource-allocation recommendations (e.g., personnel assignments, budget allocations), further reinforced by historical analogs and daily change signals for incremental updates. We evaluate the framework in real wildfire scenarios across multiple LLMs, showing that geospatially grounded agents can outperform baselines. The proposed framework can generalize to other hazards such as floods and hurricanes.

LG Mar 19, 2024
Cross-Domain Pre-training with Language Models for Transferable Time Series Representations

Mingyue Cheng, Xiaoyu Tao, Qi Liu et al.

Advances in self-supervised pre-training (SSL) have substantially improved the learning of transferable time series representations, which can benefit downstream tasks. Despite being effective, most existing works struggle to achieve cross-domain SSL pre-training, missing valuable opportunities to integrate patterns and features from different domains. The main challenge lies in the significant differences in the characteristics of time-series data across domains, such as variations in the number of channels and temporal resolution scales. To address this challenge, we propose CrossTimeNet, a novel cross-domain SSL learning framework that learns transferable knowledge from various domains to benefit the target downstream task. One of the key characteristics of CrossTimeNet is the newly designed time series tokenization module, which converts the raw time series into a sequence of discrete tokens based on a reconstruction optimization process. In addition, we highlight that predicting a high proportion of corrupted tokens can be very helpful for extracting informative patterns across different domains during SSL pre-training, which has been largely overlooked in past years. Furthermore, unlike previous works, our work treats the pre-trained language model (PLM) as the initialization of the encoder network, investigating the feasibility of transferring the knowledge learned by the PLM to the time series area. Together, these efforts pave the way toward cross-domain pre-training of a generic time series model. We conduct extensive experiments in a real-world scenario across various time series classification domains. The experimental results confirm CrossTimeNet's superior performance.
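The idea of predicting a high proportion of corrupted tokens can be sketched as a masking step over a discrete token sequence. This is a generic masked-prediction illustration, not CrossTimeNet's code: `MASK_ID`, `corrupt_tokens`, and the 0.6 ratio in the usage example are hypothetical choices.

```python
import random

MASK_ID = -1  # hypothetical placeholder id, assumed absent from the vocabulary


def corrupt_tokens(tokens, mask_ratio, rng):
    """Replace a fraction of tokens with MASK_ID.

    Returns the corrupted sequence and a dict mapping each masked
    position to the original token the model should predict.
    """
    n_mask = round(len(tokens) * mask_ratio)
    positions = rng.sample(range(len(tokens)), n_mask)
    corrupted = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        corrupted[pos] = MASK_ID
    return corrupted, targets


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for repeatable masking
    corrupted, targets = corrupt_tokens(list(range(10)), 0.6, rng)
    print(corrupted, targets)
```

A pre-training loss would then be computed only at the positions recorded in `targets`, so a higher `mask_ratio` gives the model more to reconstruct from less context.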

LG Mar 19, 2024
Advancing Time Series Classification with Multimodal Language Modeling

Mingyue Cheng, Yiheng Chen, Qi Liu et al.

Most existing time series classification methods adopt a common learning-to-classify paradigm: a classifier learns the relation between sequence inputs and target labels encoded as one-hot distributions. Although effective, this paradigm conceals two inherent limitations: (1) encoding target categories with one-hot distributions fails to reflect the comparability and similarity between labels, and (2) it is very difficult to learn a transferable model across domains, which greatly hinders the development of a universal serving paradigm. In this work, we propose InstructTime, a novel attempt to reshape time series classification as a learning-to-generate paradigm. Relying on the powerful generative capacity of the pre-trained language model, the core idea is to formulate the classification of time series as a multimodal understanding task, in which both task-specific instructions and raw time series are treated as multimodal inputs while the label information is represented by text. To accomplish this goal, InstructTime incorporates three distinct designs. First, a time series discretization module converts continuous time series into a sequence of hard tokens to resolve the inconsistency across modal inputs. Second, to close the modality representation gap, we introduce an alignment projection layer before feeding the transformed time series tokens into the language model. Third, we highlight the necessity of auto-regressive pre-training across domains, which can facilitate the transferability of the language model and boost generalization performance. Extensive experiments are conducted over benchmark datasets, and the results demonstrate the superior performance of InstructTime and the potential for a universal foundation model in time series classification.
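Converting a continuous series into hard tokens can be illustrated by nearest-codebook lookup over fixed-length patches. This is a simplified stand-in, not InstructTime's discretization module: the fixed `codebook`, the patch length, and the helper names are assumptions, and a real system would learn the codebook rather than hard-code it.

```python
def patchify(series, patch_len):
    """Split a series into non-overlapping patches, dropping any remainder."""
    last_start = len(series) - patch_len + 1
    return [series[i:i + patch_len] for i in range(0, last_start, patch_len)]


def nearest_code(patch, codebook):
    """Index of the codebook vector closest to the patch (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(patch, codebook[i]))


def tokenize(series, codebook, patch_len):
    """Map a continuous series to a list of discrete token ids."""
    return [nearest_code(p, codebook) for p in patchify(series, patch_len)]


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]  # hypothetical, hand-picked
    print(tokenize([0.1, -0.1, 0.9, 1.2, 4.0, 6.0, 7.0], codebook, patch_len=2))
```

The resulting ids can be treated like words in a vocabulary, which is what lets a language model consume them alongside text instructions.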