Shiyi Chen

h-index88

11papers

67citations

Novelty56%

AI Score39

Ranked #79,686 of 194,257 authors (top 41%)#17,748 in LG (top 44%)

11 Papers

6.6FLU-DYNOct 21, 2022

Multi-scale data reconstruction of turbulent rotating flows with Gappy POD, Extended POD and Generative Adversarial Networks

Tianyi Li, Michele Buzzicotti, Luca Biferale et al.

Data reconstruction of rotating turbulent snapshots is investigated utilizing data-driven tools. This problem is crucial for numerous geophysical applications and fundamental aspects, given the concurrent effects of direct and inverse energy cascades, which lead to non-Gaussian statistics at both large and small scales. Data assimilation also serves as a tool to rank physical features within turbulence, by evaluating the performance of reconstruction in terms of the quality and quantity of the information used. Additionally, benchmarking various reconstruction techniques is essential to assess the trade-off between quantitative supremacy, implementation complexity, and explicability. In this study, we use linear and non-linear tools based on the Proper Orthogonal Decomposition (POD) and Generative Adversarial Network (GAN) for reconstructing rotating turbulence snapshots with spatial damages (inpainting). We focus on accurately reproducing both statistical properties and instantaneous velocity fields. Different gap sizes and gap geometries are investigated in order to assess the importance of coherency and multi-scale properties of the missing information. Surprisingly enough, concerning point-wise reconstruction, the non-linear GAN does not outperform one of the linear POD techniques. On the other hand, supremacy of the GAN approach is shown when the statistical multi-scale properties are compared. Similarly, extreme events in the gap region are better predicted when using GAN. The balance between point-wise error and statistical properties is controlled by the adversarial ratio, which determines the relative importance of the generator and the discriminator in the GAN training. Robustness against the measurement noise is also discussed.

1.2FLU-DYNMar 6, 2023Code

Stabilizing the Maximal Entropy Moment Method for Rarefied Gas Dynamics at Single-Precision

Candi Zheng, Wang Yang, Shiyi Chen

The maximal entropy moment method (MEM) is systematic solution of the challenging problem: generating extended hydrodynamic equations valid for both dense and rarefied gases. However, simulating MEM suffers from a computational expensive and ill-conditioned maximal entropy problem. It causes numerical overflow and breakdown when the numerical precision is insufficient, especially for flows like high-speed shock waves. It also prevents modern GPUs from accelerating MEM with their enormous single floating-point precision computation power. This paper aims to stabilize MEM, making it possible to simulating very strong normal shock waves on modern GPUs at single precision. We improve the condition number of the maximal entropy problem by proposing gauge transformations, which moves not only flow fields but also hydrodynamic equations into a more optimal coordinate system. We addressed numerical overflow and breakdown in the maximal entropy problem by employing the canonical form of distribution and a modified Newton optimization method. Moreover, we discovered a counter-intuitive phenomenon that over-refined spatial mesh beyond mean free path degrades the stability of MEM. With these techniques, we accomplished single-precision GPU simulations of high speed shock wave up to Mach 10 utilizing 35 moments MEM, while previous methods only achieved Mach 4 on double-precision.

9.2LGAug 29, 2024

Spectral Informed Neural Network: An Efficient and Low-Memory PINN

Tianchi Yu, Yiming Qi, Ivan Oseledets et al.

With growing investigations into solving partial differential equations by physics-informed neural networks (PINNs), more accurate and efficient PINNs are required to meet the practical demands of scientific computing. One bottleneck of current PINNs is computing the high-order derivatives via automatic differentiation which often necessitates substantial computing resources. In this paper, we focus on removing the automatic differentiation of the spatial derivatives and propose a spectral-based neural network that substitutes the differential operator with a multiplication. Compared to the PINNs, our approach requires lower memory and shorter training time. Thanks to the exponential convergence of the spectral basis, our approach is more accurate. Moreover, to handle the different situations between physics domain and spectral domain, we provide two strategies to train networks by their spectral information. Through a series of comprehensive experiments, We validate the aforementioned merits of our proposed network.

14.7CLOct 26, 2025Code

Adaptive Testing for LLM Evaluation: A Psychometric Alternative to Static Benchmarks

Peiyu Li, Xiuxiu Tang, Si Chen et al.

Large language model evaluation requires thousands of benchmark items, making evaluations expensive and slow. Existing methods compute average accuracy across fixed item sets, treating all items equally despite varying quality and informativeness. We present ATLAS an adaptive testing framework using Item Response Theory (IRT) to estimate model ability through Fisher information-guided item selection. Our analysis of five major benchmarks reveals that 3-6% of items exhibit negative discrimination, indicating annotation errors that corrupt static evaluation. ATLAS achieves 90% item reduction while maintaining measurement precision: on HellaSwag (5,608 items), we match full-benchmark estimates using only 42 items with 0.154 MAE. Our framework maintains item exposure rates below 10% and test overlap at 16-27%, compared to static benchmarks where every model sees all items (100% exposure). Among 4,000+ tested models, IRT ranks differ from accuracy ranks: models with the same accuracy get different IRT scores, and 23-31% of all models shift by more than 10 rank positions. Code and calibrated item banks are available at https://github.com/Peiyu-Georgia-Li/ATLAS.git.

2.6LGAug 8, 2024

Early Risk Assessment Model for ICA Timing Strategy in Unstable Angina Patients Using Multi-Modal Machine Learning

Candi Zheng, Kun Liu, Yang Wang et al.

Background: Invasive coronary arteriography (ICA) is recognized as the gold standard for diagnosing cardiovascular diseases, including unstable angina (UA). The challenge lies in determining the optimal timing for ICA in UA patients, balancing the need for revascularization in high-risk patients against the potential complications in low-risk ones. Unlike myocardial infarction, UA does not have specific indicators like ST-segment deviation or cardiac enzymes, making risk assessment complex. Objectives: Our study aims to enhance the early risk assessment for UA patients by utilizing machine learning algorithms. These algorithms can potentially identify patients who would benefit most from ICA by analyzing less specific yet related indicators that are challenging for human physicians to interpret. Methods: We collected data from 640 UA patients at Shanghai General Hospital, including medical history and electrocardiograms (ECG). Machine learning algorithms were trained using multi-modal demographic characteristics including clinical risk factors, symptoms, biomarker levels, and ECG features extracted by pre-trained neural networks. The goal was to stratify patients based on their revascularization risk. Additionally, we translated our models into applicable and explainable look-up tables through discretization for practical clinical use. Results: The study achieved an Area Under the Curve (AUC) of $0.719 \pm 0.065$ in risk stratification, significantly surpassing the widely adopted GRACE score's AUC of $0.579 \pm 0.044$. Conclusions: The results suggest that machine learning can provide superior risk stratification for UA patients. This improved stratification could help in balancing the risks, costs, and complications associated with ICA, indicating a potential shift in clinical assessment practices for unstable angina.

11.8IRFeb 1, 2025

MIM: Multi-modal Content Interest Modeling Paradigm for User Behavior Modeling

Bencheng Yan, Si Chen, Shichang Jia et al.

Click-Through Rate (CTR) prediction is a crucial task in recommendation systems, online searches, and advertising platforms, where accurately capturing users' real interests in content is essential for performance. However, existing methods heavily rely on ID embeddings, which fail to reflect users' true preferences for content such as images and titles. This limitation becomes particularly evident in cold-start and long-tail scenarios, where traditional approaches struggle to deliver effective results. To address these challenges, we propose a novel Multi-modal Content Interest Modeling paradigm (MIM), which consists of three key stages: Pre-training, Content-Interest-Aware Supervised Fine-Tuning (C-SFT), and Content-Interest-Aware UBM (CiUBM). The pre-training stage adapts foundational models to domain-specific data, enabling the extraction of high-quality multi-modal embeddings. The C-SFT stage bridges the semantic gap between content and user interests by leveraging user behavior signals to guide the alignment of embeddings with user preferences. Finally, the CiUBM stage integrates multi-modal embeddings and ID-based collaborative filtering signals into a unified framework. Comprehensive offline experiments and online A/B tests conducted on the Taobao, one of the world's largest e-commerce platforms, demonstrated the effectiveness and efficiency of MIM method. The method has been successfully deployed online, achieving a significant increase of +14.14% in CTR and +4.12% in RPM, showcasing its industrial applicability and substantial impact on platform performance. To promote further research, we have publicly released the code and dataset at https://pan.quark.cn/s/8fc8ec3e74f3.

7.9LGNov 27, 2024

Diffeomorphic Latent Neural Operators for Data-Efficient Learning of Solutions to Partial Differential Equations

Zan Ahmad, Shiyi Chen, Minglang Yin et al.

A computed approximation of the solution operator to a system of partial differential equations (PDEs) is needed in various areas of science and engineering. Neural operators have been shown to be quite effective at predicting these solution generators after training on high-fidelity ground truth data (e.g. numerical simulations). However, in order to generalize well to unseen spatial domains, neural operators must be trained on an extensive amount of geometrically varying data samples that may not be feasible to acquire or simulate in certain contexts (e.g., patient-specific medical data, large-scale computationally intensive simulations.) We propose that in order to learn a PDE solution operator that can generalize across multiple domains without needing to sample enough data expressive enough for all possible geometries, we can train instead a latent neural operator on just a few ground truth solution fields diffeomorphically mapped from different geometric/spatial domains to a fixed reference configuration. Furthermore, the form of the solutions is dependent on the choice of mapping to and from the reference domain. We emphasize that preserving properties of the differential operator when constructing these mappings can significantly reduce the data requirement for achieving an accurate model due to the regularity of the solution fields that the latent neural operator is training on. We provide motivating numerical experimentation that demonstrates an extreme case of this consideration by exploiting the conformal invariance of the Laplacian

1.2COMP-PHDec 21, 2024Code

An explainable operator approximation framework under the guideline of Green's function

Jianghang Gu, Ling Wen, Yuntian Chen et al.

Traditional numerical methods, such as the finite element method and finite volume method, adress partial differential equations (PDEs) by discretizing them into algebraic equations and solving these iteratively. However, this process is often computationally expensive and time-consuming. An alternative approach involves transforming PDEs into integral equations and solving them using Green's functions, which provide analytical solutions. Nevertheless, deriving Green's functions analytically is a challenging and non-trivial task, particularly for complex systems. In this study, we introduce a novel framework, termed GreensONet, which is constructed based on the strucutre of deep operator networks (DeepONet) to learn embedded Green's functions and solve PDEs via Green's integral formulation. Specifically, the Trunk Net within GreensONet is designed to approximate the unknown Green's functions of the system, while the Branch Net are utilized to approximate the auxiliary gradients of the Green's function. These outputs are subsequently employed to perform surface integrals and volume integrals, incorporating user-defined boundary conditions and source terms, respectively. The effectiveness of the proposed framework is demonstrated on three types of PDEs in bounded domains: 3D heat conduction equations, reaction-diffusion equations, and Stokes equations. Comparative results in these cases demonstrate that GreenONet's accuracy and generalization ability surpass those of existing methods, including Physics-Informed Neural Networks (PINN), DeepONet, Physics-Informed DeepONet (PI-DeepONet), and Fourier Neural Operators (FNO).

3.3FLU-DYNJun 2, 2024

Discovering an interpretable mathematical expression for a full wind-turbine wake with artificial intelligence enhanced symbolic regression

Ding Wang, Yuntian Chen, Shiyi Chen

The rapid expansion of wind power worldwide underscores the critical significance of engineering-focused analytical wake models in both the design and operation of wind farms. These theoretically-derived ana lytical wake models have limited predictive capabilities, particularly in the near-wake region close to the turbine rotor, due to assumptions that do not hold. Knowledge discovery methods can bridge these gaps by extracting insights, adjusting for theoretical assumptions, and developing accurate models for physical processes. In this study, we introduce a genetic symbolic regression (SR) algorithm to discover an interpretable mathematical expression for the mean velocity deficit throughout the wake, a previously unavailable insight. By incorporating a double Gaussian distribution into the SR algorithm as domain knowledge and designing a hierarchical equation structure, the search space is reduced, thus efficiently finding a concise, physically informed, and robust wake model. The proposed mathematical expression (equation) can predict the wake velocity deficit at any location in the full-wake region with high precision and stability. The model's effectiveness and practicality are validated through experimental data and high-fidelity numerical simulations.

1.2FLU-DYNAug 1, 2021

Data-Driven Constitutive Relation Reveals Scaling Law for Hydrodynamic Transport Coefficients

Candi Zheng, Yang Wang, Shiyi Chen

Finding extended hydrodynamics equations valid from the dense gas region to the rarefied gas region remains a great challenge. The key to success is to obtain accurate constitutive relations for stress and heat flux. Data-driven models offer a new phenomenological approach to learning constitutive relations from data. Such models enable complex constitutive relations that extend Newton's law of viscosity and Fourier's law of heat conduction by regression on higher derivatives. However, the choices of derivatives in these models are ad-hoc without a clear physical explanation. We investigated data-driven models theoretically on a linear system. We argue that these models are equivalent to non-linear length scale scaling laws of transport coefficients. The equivalence to scaling laws justified the physical plausibility and revealed the limitation of data-driven models. Our argument also points out that modeling the scaling law could avoid practical difficulties in data-driven models like derivative estimation and variable selection on noisy data. We further proposed a constitutive relation model based on scaling law and tested it on the calculation of Rayleigh scattering spectra. The result shows our data-driven model has a clear advantage over the Chapman-Enskog expansion and moment methods.

2.7LGSep 23, 2019

Heterogeneous-Temporal Graph Convolutional Networks: Make the Community Detection Much Better

Yaping Zheng, Shiyi Chen, Xinni Zhang et al.

Community detection has long been an important yet challenging task to analyze complex networks with a focus on detecting topological structures of graph data. Essentially, real-world graph data contains various features, node and edge types which dynamically vary over time, and this invalidates most existing community detection approaches. To cope with these issues, this paper proposes the heterogeneous-temporal graph convolutional networks (HTGCN) to detect communities from hetergeneous and temporal graphs. Particularly, we first design a heterogeneous GCN component to acquire feature representations for each heterogeneous graph at each time step. Then, a residual compressed aggregation component is proposed to represent "dynamic" features for "varying" communities, which are then aggregated with "static" features extracted from current graph. Extensive experiments are evaluated on two real-world datasets, i.e., DBLP and IMDB. The promising results demonstrate that the proposed HTGCN is superior to both benchmark and the state-of-the-art approaches, e.g., GCN, GAT, GNN, LGNN, HAN and STAR, with respect to a number of evaluation criteria.