Xiaoyong Li

LG
h-index: 16
8 papers
9 citations
Novelty: 48%
AI Score: 44

8 Papers

LG · Aug 21, 2024
Benchmarking AI-based data assimilation to advance data-driven global weather forecasting

Wuxin Wang, Weicheng Ni, Ben Fei et al.

Research on Artificial Intelligence (AI)-based Data Assimilation (DA) is expanding rapidly. However, the absence of an objective, comprehensive, and real-world benchmark hinders the fair comparison of diverse methods. Here, we introduce DABench, a benchmark designed to support the development and evaluation of AI-based DA methods. By integrating real-world observations, DABench provides an objective and fair platform for validating long-term closed-loop DA cycles, supporting both deterministic and ensemble configurations. Furthermore, we assess the efficacy of AI-based DA in generating initial conditions for an advanced AI-based weather forecasting model to produce accurate medium-range global weather forecasts. Our dual validation, utilizing both reanalysis data and independent radiosonde observations, demonstrates that AI-based DA achieves performance competitive with state-of-the-art AI-driven four-dimensional variational frameworks across both global weather DA and medium-range forecasting metrics. We invite the research community to utilize DABench to accelerate the advancement of AI-based DA for global weather forecasting.
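
As a rough illustration of what a closed-loop DA cycle means, the toy sketch below alternates a forecast step with an analysis step on a single scalar state. It is not DABench or any method from the paper: the names `analysis_step`, `run_cycles`, `forecast_model`, and the fixed scalar `gain` are all assumptions chosen for illustration, and the update is the textbook blend of background and observation.

```python
def analysis_step(background, observation, gain):
    """Blend the background (forecast) with an observation.

    gain = 0 trusts the forecast fully, gain = 1 trusts the observation fully.
    """
    return background + gain * (observation - background)


def run_cycles(initial_state, observations, forecast_model, gain):
    """Run a closed loop: each analysis becomes the next forecast's start.

    forecast_model is a hypothetical callable mapping a state to the
    state one assimilation window later.
    """
    state = initial_state
    analyses = []
    for obs in observations:
        background = forecast_model(state)               # forecast step
        state = analysis_step(background, obs, gain)     # analysis step
        analyses.append(state)
    return analyses


# Toy usage: a persistence "model" and two identical observations.
analyses = run_cycles(0.0, [2.0, 2.0], lambda x: x, gain=0.5)
```

Because each analysis feeds the next forecast, errors can accumulate or shrink over many cycles, which is why the abstract stresses long-term validation rather than a single assimilation step.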

CE · May 19
RefiningGPT: Specialized language Models for Automated Refinery Unit-level Process Diagram Synthesis

Dongxiao Liu, Yuwen Ding, Xinghai Wei et al.

Applying LLMs to complex industrial processes remains challenging due to the semantic gap between natural language design intents and the rigorous physical logic of engineering. In the field of petroleum refining engineering, a critical bottleneck is the automated synthesis of Unit-level Process Diagrams (UPDs), which serve as the topological bridge connecting abstract requirements to concrete unit operations. In this paper, we propose RefineGPT, a domain-specialized agent for autonomous refinery design. RefineGPT adopts a hierarchical architecture in which a supervised fine-tuned small language model is responsible for selecting units that satisfy design requirements, while a large language model is used to connect these units to generate the final topology. To enable supervised training, we develop a pipeline that extracts latent process motifs from noisy, unstructured legacy topologies and synthesizes high-quality rationale-based Chain-of-Thought (CoT) training data. Empirical validation demonstrates that RefineGPT achieves substantial improvements in topological consistency and chemical engineering feasibility, establishing a high-fidelity pathway for AI-augmented industrial process synthesis.

CV · Mar 19
Mind the Rarities: Can Rare Skin Diseases Be Reliably Diagnosed via Diagnostic Reasoning?

Yang Liu, Jiyao Yang, Hongjin Zhao et al.

Large vision-language models (LVLMs) demonstrate strong performance in dermatology; however, evaluating diagnostic reasoning for rare conditions remains largely unexplored. Existing benchmarks focus on common diseases and assess only final accuracy, overlooking the clinical reasoning process, which is critical for complex cases. We address this gap by constructing DermCase, a long-context benchmark derived from peer-reviewed case reports. Our dataset contains 26,030 multi-modal image-text pairs and 6,354 clinically challenging cases, each annotated with comprehensive clinical information and step-by-step reasoning chains. To enable reliable evaluation, we establish DermLIP-based similarity metrics that achieve stronger alignment with dermatologists for assessing differential diagnosis quality. Benchmarking 22 leading LVLMs exposes significant deficiencies across diagnosis accuracy, differential diagnosis, and clinical reasoning. Fine-tuning experiments demonstrate that instruction tuning substantially improves performance while Direct Preference Optimization (DPO) yields minimal gains. Systematic error analysis further reveals critical limitations in current models' reasoning capabilities.

LG · Aug 5, 2024
RCDM: Enabling Robustness for Conditional Diffusion Model

Weifeng Xu, Xiang Zhu, Xiaoyong Li

The conditional diffusion model (CDM) enhances the standard diffusion model by providing more control, improving the quality and relevance of the outputs, and making the model adaptable to a wider range of complex tasks. However, inaccurate conditional inputs in the inverse process of CDM can easily lead to fixed errors in the neural network, which diminishes the adaptability of a well-trained model. Existing methods such as data augmentation, adversarial training, and robust optimization can improve robustness, but they often face challenges such as high computational complexity, limited applicability to unknown perturbations, and increased training difficulty. In this paper, we propose a lightweight solution, the Robust Conditional Diffusion Model (RCDM), based on control theory to dynamically reduce the impact of noise and significantly enhance the model's robustness. RCDM leverages the collaborative interaction between two neural networks, along with optimal control strategies derived from control theory, to optimize the weights of the two networks during the sampling process. Unlike conventional techniques, RCDM establishes a mathematical relationship between fixed errors and the weights of the two neural networks without incurring additional computational overhead. Extensive experiments were conducted on the MNIST and CIFAR-10 datasets, and the results demonstrate the effectiveness and adaptability of our proposed model.

LG · Jul 12, 2025
XiChen: An observation-scalable fully AI-driven global weather forecasting system with 4D variational knowledge

Wuxin Wang, Weicheng Ni, Lilan Huang et al.

Recent advancements in Artificial Intelligence (AI) demonstrate significant potential to revolutionize weather forecasting. However, most AI-driven models rely on Numerical Weather Prediction (NWP) systems for initial condition preparation, which often consumes hours on supercomputers. Here we introduce XiChen, the first observation-scalable fully AI-driven global weather forecasting system, whose entire pipeline, from Data Assimilation (DA) to medium-range forecasting, can be accomplished within only 17 seconds. XiChen is built upon a foundation model that is pre-trained for weather forecasting. This model is subsequently fine-tuned to serve as both observation operators and DA models, thereby scalably assimilating conventional and raw satellite observations. Furthermore, the integration of four-dimensional variational knowledge ensures that XiChen's DA and medium-range forecasting accuracy rivals that of operational NWP systems, achieving a skillful forecasting lead time exceeding 8.25 days. These findings demonstrate that XiChen holds strong potential toward fully AI-driven weather forecasting independent of NWP systems.

CV · Dec 30, 2024
Generalize Your Face Forgery Detectors: An Insertable Adaptation Module Is All You Need

Xiaotian Si, Linghui Li, Liwei Zhang et al.

A plethora of face forgery detectors exist to tackle facial deepfake risks. However, their practical application is hindered by the challenge of generalizing to forgeries unseen during the training stage. To this end, we introduce an insertable adaptation module that can adapt a trained off-the-shelf detector using only online unlabeled test data, without requiring modifications to the architecture or training process. Specifically, we first present a learnable class prototype-based classifier that generates predictions from the revised features and prototypes, enabling effective handling of various forgery clues and domain gaps during online testing. Additionally, we propose a nearest feature calibrator to further improve prediction accuracy and reduce the impact of noisy pseudo-labels during self-training. Experiments across multiple datasets show that our module achieves superior generalization compared to state-of-the-art methods. Moreover, it functions as a plug-and-play component that can be combined with various detectors to enhance the overall performance.
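
To make the idea of a class prototype-based classifier concrete, here is a minimal sketch that assigns a feature vector to the class whose prototype is most similar, then nudges that prototype toward the new feature. It is an illustration only, not the authors' module: cosine similarity, the moving-average update, and the names `cosine`, `predict`, `update_prototype`, and `momentum` are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict(feature, prototypes):
    """Return the label whose prototype is most cosine-similar to the feature."""
    return max(prototypes, key=lambda label: cosine(feature, prototypes[label]))


def update_prototype(prototypes, label, feature, momentum=0.9):
    """Move one prototype toward a new feature with an exponential moving average.

    In an online setting, `label` would be a pseudo-label from `predict`.
    """
    old = prototypes[label]
    prototypes[label] = [momentum * o + (1 - momentum) * f
                         for o, f in zip(old, feature)]


# Toy usage with two hand-picked 2-D prototypes.
prototypes = {"real": [1.0, 0.0], "fake": [0.0, 1.0]}
label = predict([0.2, 0.8], prototypes)
update_prototype(prototypes, label, [1.0, 1.0], momentum=0.5)
```

Updating prototypes from pseudo-labels is exactly where noisy labels can do damage, which is the problem the abstract's nearest feature calibrator is said to address.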

LG · Mar 10, 2021
A Local Similarity-Preserving Framework for Nonlinear Dimensionality Reduction with Neural Networks

Xiang Wang, Xiaoyong Li, Junxing Zhu et al.

Real-world data usually have high dimensionality, and it is important to mitigate the curse of dimensionality. High-dimensional data usually have a coherent structure, which gives them a relatively small number of true degrees of freedom. There are global and local dimensionality reduction methods to alleviate the problem. Most existing methods for local dimensionality reduction obtain an embedding with the eigenvalue or singular value decomposition, where the computational complexities are very high for a large amount of data. Here we propose a novel local nonlinear approach named Vec2vec for general-purpose dimensionality reduction, which generalizes recent advancements in embedding representation learning of words to dimensionality reduction of matrices. It obtains the nonlinear embedding using a neural network with only one hidden layer to reduce the computational complexity. To train the neural network, we build the neighborhood similarity graph of a matrix and define the context of data points by exploiting the random walk properties. Experiments demonstrate that Vec2vec is more efficient than several state-of-the-art local dimensionality reduction methods on a large number of high-dimensional data. Extensive experiments on data classification and clustering with eight real datasets show that Vec2vec is better than several classical dimensionality reduction methods in the statistical hypothesis test, and it is competitive with the recently developed state-of-the-art UMAP.
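
The graph-and-random-walk step described above can be sketched roughly as follows. This is not the Vec2vec implementation: the Euclidean k-nearest-neighbour graph, uniform neighbour sampling, and the names `knn_graph`, `random_walk`, `context_pairs`, `k`, and `window` are assumptions made for illustration. The resulting (center, context) pairs are the kind of input a one-hidden-layer, word2vec-style network would be trained on.

```python
import math
import random


def knn_graph(points, k):
    """Map each point index to the indices of its k nearest neighbours (Euclidean)."""
    graph = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        graph[i] = others[:k]
    return graph


def random_walk(graph, start, length, rng):
    """Walk `length` nodes, moving to a uniformly chosen neighbour each step."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk


def context_pairs(walk, window):
    """Emit (center, context) pairs for nodes within `window` steps in the walk."""
    pairs = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


# Toy usage: two well-separated clusters of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
graph = knn_graph(points, k=1)
walk = random_walk(graph, start=0, length=5, rng=random.Random(0))
pairs = context_pairs(walk, window=1)
```

Points that are close in the original space co-occur in walks, so an embedding trained to predict context from center tends to keep them close, which is the local similarity the title refers to.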

IR · Sep 1, 2019
Employ Multimodal Machine Learning for Content quality analysis

Eric Du, Xiaoyong Li

The task of identifying high-quality content is becoming increasingly important, and it can improve overall reading time and click-through rate (CTR) estimates. General quality analysis focuses on only a single modality, such as image or text, but on today's mainstream media sites a lot of information is presented in graphic form. In this paper we propose a multimodal quality recognition approach for the quality score. First, we use two feature extractors, one for the image and another for the text. After that, we use a Siamese network with the rank loss as the optimization objective. Compared with other approaches, our approach achieves more accurate results.
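
A minimal sketch of the Siamese ranking setup described above, under stated assumptions: the abstract does not give the exact loss or fusion, so the hinge-style pairwise loss, the concatenate-then-linear scorer, and the names `fused_score`, `pairwise_rank_loss`, `weights`, and `margin` are hypothetical. The "Siamese" part is that the same `weights` score both items in a pair.

```python
def fused_score(image_feat, text_feat, weights):
    """Score one item by concatenating its image and text features
    and applying a shared linear layer (no bias, for brevity)."""
    features = list(image_feat) + list(text_feat)
    return sum(w * x for w, x in zip(weights, features))


def pairwise_rank_loss(score_high, score_low, margin=1.0):
    """Hinge-style ranking loss: zero once the higher-quality item
    outscores the lower-quality one by at least `margin`."""
    return max(0.0, margin - (score_high - score_low))


# Toy usage: the same weights are applied to both items of the pair.
weights = [0.5, -0.25, 1.0]
good = fused_score([1.0, 0.0], [2.0], weights)
bad = fused_score([0.0, 2.0], [0.5], weights)
loss = pairwise_rank_loss(good, bad)
```

Training on pairs only requires knowing which of two items is better, not an absolute quality score, which is one common reason to prefer a ranking objective for this kind of task.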