Qi Xin

SE
h-index: 45
11 papers
133 citations
Novelty: 44%
AI Score: 52

11 Papers

75.5 NA May 23
A quasi-monolithic localized high-order ALE finite element method for multi-scale fluid-structure interaction problems

Lingyue Shen, Qi Xin, Yan Chen et al.

This paper presents a quasi-monolithic localized high-order arbitrary Lagrangian-Eulerian (qMLH-ALE) finite element method for multi-scale fluid-structure interaction (FSI) in microfluidic systems. The fluid momentum, the incompressible Neo-Hookean constitutive law, and the left Cauchy-Green tensor $\mathcal{B}$ are assembled into a single implicit system, while the harmonic mesh extension is updated explicitly in a staggered manner. Isoparametric $\mathcal{P}_2$ elements provide third-order geometric approximation of curved fluid-solid interfaces, and a second-order implicit-explicit partitioned Runge-Kutta scheme delivers second-order temporal accuracy without the dissipation of backward Euler. A localized updating strategy confines the moving mesh and the deformation history to a body-fitted sub-domain coupled with a precomputed steady background flow, bridging the scale disparity between local FSI dynamics and the macroscopic microchannel geometry. The Turek-Hron FSI3 benchmark, performed at unit fluid-solid density ratio, reproduces the reference beam-tip amplitude and frequency within $3\%$, confirming stability under the strong added-mass coupling that destabilizes conventional partitioned schemes. Three-dimensional particle-focusing simulations in spiral microchannels further illustrate the framework on long-range multi-scale problems.

CL Feb 4
ERNIE 5.0 Technical Report

Haifeng Wang, Hua Wu, Tian Wu et al.

In this report, we introduce ERNIE 5.0, a natively autoregressive foundation model designed for unified multimodal understanding and generation across text, image, video, and audio. All modalities are trained from scratch under a unified next-group-of-tokens prediction objective, based on an ultra-sparse mixture-of-experts (MoE) architecture with modality-agnostic expert routing. To address practical challenges in large-scale deployment under diverse resource constraints, ERNIE 5.0 adopts a novel elastic training paradigm. Within a single pre-training run, the model learns a family of sub-models with varying depths, expert capacities, and routing sparsity, enabling flexible trade-offs among performance, model size, and inference latency in memory- or time-constrained scenarios. Moreover, we systematically address the challenges of scaling reinforcement learning to unified foundation models, thereby guaranteeing efficient and stable post-training under ultra-sparse MoE architectures and diverse multimodal settings. Extensive experiments demonstrate that ERNIE 5.0 achieves strong and balanced performance across multiple modalities. To the best of our knowledge, among publicly disclosed models, ERNIE 5.0 represents the first production-scale realization of a trillion-parameter unified autoregressive model that supports both multimodal understanding and generation. To facilitate further research, we present detailed visualizations of modality-agnostic expert routing in the unified model, alongside comprehensive empirical analysis of elastic training, aiming to offer profound insights to the community.

76.2 NA Apr 14
Sharp inf-sup estimate for the Stokes equation in tight domains with periodic pillars and some numerical implications

Qi Xin, Shihua Gong, Jinchao Xu

The predictive simulation of fluid dynamics in densely packed microfluidic devices, such as Deterministic Lateral Displacement (DLD) arrays, is severely bottlenecked by the stagnation of standard iterative solvers. In this paper, we reveal that this failure is not an algorithmic artifact, but fundamentally rooted in the pre-asymptotic degradation of the pressure-velocity coupling stability. By rigorously analyzing periodic pillar geometries in a generalized lattice framework, we prove that the constant in the continuous Ladyzhenskaya-Babuška-Brezzi (LBB) condition, also called the inf-sup constant, deteriorates exactly as $m^{-1}$ up to a positive multiplicative constant, where $m$ is the pillar density (the number of pillars per unit length). This causes severe a priori error amplification and extreme ill-conditioning in the Schur complement of the saddle-point system. To overcome this theoretical limit, we propose a parameter-free, adaptively scaled Augmented Lagrangian (AL) stabilization strategy. Extensive numerical experiments on both standard square and highly asymmetric DLD arrays validate our theoretical bounds and demonstrate the robustness of the proposed AL method.
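
For reference, the inf-sup constant named in this abstract is conventionally defined as below, together with the two-sided scaling the abstract describes. The function spaces (homogeneous Dirichlet velocity, mean-zero pressure), the domain symbol $\Omega_m$, and the constants $c_1, c_2 > 0$ are assumptions made here for illustration; the paper's precise setting for periodic pillar domains may differ.

$$
\beta(\Omega_m) \;=\; \inf_{q \in L^2_0(\Omega_m)\setminus\{0\}} \; \sup_{v \in H^1_0(\Omega_m)^d\setminus\{0\}} \frac{\int_{\Omega_m} q \, \nabla\cdot v \, dx}{\|\nabla v\|_{L^2(\Omega_m)} \, \|q\|_{L^2(\Omega_m)}},
\qquad
c_1\, m^{-1} \;\le\; \beta(\Omega_m) \;\le\; c_2\, m^{-1}.
$$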

IR Mar 28, 2024
Intelligent Classification and Personalized Recommendation of E-commerce Products Based on Machine Learning

Kangming Xu, Huiming Zhou, Haotian Zheng et al.

With the rapid evolution of the Internet and the exponential proliferation of information, users encounter information overload and the conundrum of choice. Personalized recommendation systems play a pivotal role in alleviating this burden by aiding users in filtering and selecting information tailored to their preferences and requirements. Such systems not only enhance user experience and satisfaction but also furnish opportunities for businesses and platforms to augment user engagement, sales, and advertising efficacy. This paper undertakes a comparative analysis between the operational mechanisms of traditional e-commerce commodity classification systems and personalized recommendation systems. It delineates the significance and application of personalized recommendation systems across e-commerce, content information, and media domains. Furthermore, it delves into the challenges confronting personalized recommendation systems in e-commerce, including data privacy, algorithmic bias, scalability, and the cold start problem. Strategies to address these challenges are elucidated. Subsequently, the paper outlines a personalized recommendation system leveraging the BERT model and nearest neighbor algorithm, specifically tailored to address the exigencies of the eBay e-commerce platform. The efficacy of this recommendation system is substantiated through manual evaluation, and a practical application operational guide and structured output recommendation results are furnished to ensure the system's operability and scalability.
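
The nearest-neighbor step mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the three-dimensional vectors are made-up stand-ins for BERT embeddings, and the names `cosine`, `nearest`, and `catalog` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(query, catalog, k=2):
    """Return the k catalog item names most similar to the query vector."""
    ranked = sorted(catalog, key=lambda name: cosine(query, catalog[name]), reverse=True)
    return ranked[:k]


# Toy vectors standing in for BERT embeddings of product titles (assumed values).
catalog = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "coffee grinder": [0.0, 0.1, 0.9],
}

# Query embedding of a product the user just viewed (also an assumed value).
top = nearest([1.0, 0.0, 0.0], catalog, k=2)
```

In a real system the embeddings would come from a text encoder and the search would typically use an approximate nearest-neighbor index rather than a full sort.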

89.1 SE May 7
SiblingRepair: Sibling-Based Multi-Hunk Repair with Large Language Models

Xinyu Liu, Jiayu Ren, Yusen Wang et al.

Developers often make similar mistakes across code locations implementing related functionalities. These locations, called siblings, share similar issues and require similar fixes. Accurately identifying siblings and consistently repairing them are crucial for automated program repair. Hercules is a SOTA technique designed for sibling repair. However, it is limited by strong assumptions about sibling locations and commit-history availability, rigid AST-based sibling matching, and inflexible template-based patch generation. To address these limitations, we present SiblingRepair, a new LLM-based multi-hunk APR technique specialized for sibling repair. Starting from a suspicious location identified by spectrum-based fault localization, SiblingRepair searches for semantically related sibling candidates using token- and embedding-based code matching, without restricting discovery to failing-test coverage or commit history. It then uses an LLM to identify failure-relevant siblings and generate consistent patches through two complementary strategies: simultaneous repair, which jointly repairs siblings, and iterative repair, which progressively analyzes candidates for patch construction. SiblingRepair further preserves promising patches generated from earlier suspicious locations and combines them into generalized multi-hunk patches. We evaluate SiblingRepair on the Defects4J and GHRB benchmarks. The results show that SiblingRepair substantially outperforms SOTA multi-hunk repair techniques including Hercules. Our evaluation further demonstrates its repair efficiency, the effectiveness of its sibling detection and repair components, and the limited impact of LLM data leakage on the results. Overall, SiblingRepair advances automated sibling and general multi-hunk repair.
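
To make the idea of token-based sibling matching concrete, here is a minimal sketch that ranks candidate statements by Jaccard similarity over identifier tokens. It is an illustration under stated assumptions, not SiblingRepair's actual matcher: the tokenizer, the `min_score` threshold, and the function names are hypothetical, and the embedding-based matching the abstract also mentions is left out.

```python
import re


def tokens(code):
    """Set of identifier-like tokens in a code fragment (simplified tokenizer)."""
    return set(re.findall(r"[A-Za-z_]\w*", code))


def jaccard(a, b):
    """Jaccard similarity between the token sets of two fragments."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def sibling_candidates(suspicious, others, min_score=0.5):
    """Fragments whose token overlap with the suspicious one meets the threshold."""
    scored = [(jaccard(suspicious, other), other) for other in others]
    return [other for score, other in sorted(scored, reverse=True) if score >= min_score]


suspicious = "if (width < 0) throw new IllegalArgumentException(msg);"
others = [
    "if (height < 0) throw new IllegalArgumentException(msg);",
    "return width * height;",
]
candidates = sibling_candidates(suspicious, others)
```

Here only the `height` check shares enough tokens with the suspicious `width` check to be kept as a candidate; the multiplication statement falls below the assumed threshold.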

74.5 NC Mar 15
D-MEM: Dopamine-Gated Agentic Memory via Reward Prediction Error Routing

Yuru Song, Qi Xin

Autonomous LLM agents require structured long-term memory, yet current "append-and-evolve" systems like A-MEM face O(N^2) write-latency and excessive token costs. We introduce D-MEM (Dopamine-Gated Agentic Memory), a biologically inspired architecture that decouples short-term interaction from cognitive restructuring via a Fast/Slow routing system based on Reward Prediction Error (RPE). A lightweight Critic Router evaluates stimuli for Surprise and Utility. Routine, low-RPE inputs are bypassed or cached in an O(1) fast-access buffer. Conversely, high-RPE inputs, such as factual contradictions or preference shifts, trigger a "dopamine" signal, activating the O(N) memory evolution pipeline to reshape the agent's knowledge graph. To evaluate performance under realistic conditions, we introduce the LoCoMo-Noise benchmark, which injects controlled conversational noise into long-term sessions. Evaluations demonstrate that D-MEM reduces token consumption by over 80%, eliminates O(N^2) bottlenecks, and outperforms baselines in multi-hop reasoning and adversarial resilience. By selectively gating cognitive restructuring, D-MEM provides a scalable, cost-efficient foundation for lifelong agentic memory.
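
The fast/slow routing described in this abstract can be sketched as a gate on a surprise score. This is a toy illustration, not the D-MEM implementation: the `GatedMemory` class, the hand-written `rpe` scoring rule, and the 0.5 threshold are all assumptions, and a plain dictionary stands in for the agent's knowledge graph.

```python
from dataclasses import dataclass, field


@dataclass
class GatedMemory:
    """Toy fast/slow memory gated by a surprise score (all names hypothetical)."""

    threshold: float = 0.5  # assumed RPE cutoff, not taken from the paper
    fast_buffer: list = field(default_factory=list)  # cheap append-only path
    graph: dict = field(default_factory=dict)  # stand-in for the knowledge graph

    def rpe(self, key, value):
        """Stand-in critic: 0.6 for a new key, 1.0 for a contradiction, 0.0 if known."""
        if key not in self.graph:
            return 0.6
        return 0.0 if self.graph[key] == value else 1.0

    def write(self, key, value):
        """Route one observation and return which path handled it."""
        if self.rpe(key, value) >= self.threshold:
            self.graph[key] = value  # placeholder for the costly evolution step
            return "slow"
        self.fast_buffer.append((key, value))
        return "fast"


memory = GatedMemory()
routes = [
    memory.write("favorite_drink", "coffee"),  # new fact
    memory.write("favorite_drink", "coffee"),  # routine repeat
    memory.write("favorite_drink", "tea"),  # preference shift
]
```

Only the new fact and the preference shift reach the slow path; the routine repeat is cached in the fast buffer, which is the cost-saving behavior the abstract attributes to RPE gating.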

AR May 9, 2025
What Is Next for LLMs? Next-Generation AI Computing Hardware Using Photonic Chips

Renjie Li, Wenjie Wei, Qi Xin et al.

Large language models (LLMs) are rapidly pushing the limits of contemporary computing hardware. For example, training GPT-3 has been estimated to consume around 1300 MWh of electricity, and projections suggest future models may require city-scale (gigawatt) power budgets. These demands motivate exploration of computing paradigms beyond conventional von Neumann architectures. This review surveys emerging photonic hardware optimized for next-generation generative AI computing. We discuss integrated photonic neural network architectures (e.g., Mach-Zehnder interferometer meshes, lasers, wavelength-multiplexed microring resonators) that perform ultrafast matrix operations. We also examine promising alternative neuromorphic devices, including spiking neural network circuits and hybrid spintronic-photonic synapses, which combine memory and processing. The integration of two-dimensional materials (graphene, TMDCs) into silicon photonic platforms is reviewed for tunable modulators and on-chip synaptic elements. Transformer-based LLM architectures (self-attention and feed-forward layers) are analyzed in this context, identifying strategies and challenges for mapping dynamic matrix multiplications onto these novel hardware substrates. We then dissect the mechanisms of mainstream LLMs, such as ChatGPT, DeepSeek, and LLaMA, highlighting their architectural similarities and differences. We synthesize state-of-the-art components, algorithms, and integration methods, highlighting key advances and open issues in scaling such systems to mega-sized LLM models. We find that photonic computing systems could potentially surpass electronic processors by orders of magnitude in throughput and energy efficiency, but require breakthroughs in memory, especially for long-context windows and long token sequences, and in storage of ultra-large datasets.

CV Jun 25, 2024
Optimization of Autonomous Driving Image Detection Based on RFAConv and Triplet Attention

Zhipeng Ling, Qi Xin, Yiyu Lin et al.

YOLOv8 plays a crucial role in the realm of autonomous driving, owing to its high-speed target detection, precise identification and positioning, and versatile compatibility across multiple platforms. By processing video streams or images in real-time, YOLOv8 rapidly and accurately identifies obstacles such as vehicles and pedestrians on roadways, offering essential visual data for autonomous driving systems. Moreover, YOLOv8 supports various tasks including instance segmentation, image classification, and pose estimation, thereby providing comprehensive visual perception for autonomous driving, ultimately enhancing driving safety and efficiency. Recognizing the significance of object detection in autonomous driving scenarios and the challenges faced by existing methods, this paper proposes a holistic approach to enhance the YOLOv8 model. The study introduces two pivotal modifications: the C2f_RFAConv module and the Triplet Attention mechanism. Firstly, the proposed modifications are elaborated upon in the methodological section. The C2f_RFAConv module replaces the original module to enhance feature extraction efficiency, while the Triplet Attention mechanism enhances feature focus. Subsequently, the experimental procedure delineates the training and evaluation process, encompassing training the original YOLOv8, integrating modified modules, and assessing performance improvements using metrics and PR curves. The results demonstrate the efficacy of the modifications, with the improved YOLOv8 model exhibiting significant performance enhancements, including increased mAP values and improvements in PR curves. Lastly, the analysis section elucidates the results and attributes the performance improvements to the introduced modules. C2f_RFAConv enhances feature extraction efficiency, while Triplet Attention improves feature focus for enhanced target detection.

SE Feb 11, 2022
A Quick Repair Facility for Debugging

Steven P. Reiss, Qi Xin

Modern development environments provide a widely used auto-correction facility for quickly repairing syntactic errors. Auto-correction cannot deal with semantic errors, which are much more difficult to repair. Automated program repair techniques, designed for repairing semantic errors, are not well-suited for interactive use while debugging, as they typically assume the existence of a high-quality test suite and take considerable time. To bridge the gap, we developed ROSE, a tool to suggest quick-yet-effective repairs of semantic errors during debugging. ROSE does not rely on a test suite. Instead, it assumes a debugger stopping point where a problem is observed. It asks the developer to quickly describe what is wrong, performs a light-weight fault localization to identify potential responsible locations, and uses a generate-and-validate strategy to produce and validate repairs. Finally, it presents the results so the developer can choose and make the appropriate repair. To assess its utility, we implemented a prototype of ROSE that works in the Eclipse IDE and applied it to two benchmarks, QuixBugs and Defects4J, for repair. ROSE was able to suggest correct repairs for 17 QuixBugs and 16 Defects4J errors in seconds.

SE Mar 11, 2019
Revisiting ssFix for Better Program Repair

Qi Xin, Steven P. Reiss

A branch of automated program repair (APR) techniques looks at finding and reusing existing code for bug repair. ssFix is one such technique and is syntactic search-based: it searches a code database for code fragments that are syntactically similar to the bug context and reuses such retrieved code fragments to produce patches. Using such a syntactic approach, ssFix is relatively lightweight and was shown to outperform many other APR techniques. In this paper, to investigate the true effectiveness of ssFix, we conducted multiple experiments to validate ssFix's underlying assumption (i.e., to see whether it is often possible to reuse existing code for bug repair) and evaluate its code search and code reuse approaches. Our results show that while the basic idea of ssFix, i.e., reusing existing code for bug repair, is promising, the approaches ssFix uses are not the best and can be significantly improved. We proposed a new repair technique, sharpFix, which follows ssFix's basic idea but differs in the code search and reuse approaches used. We evaluated sharpFix and ssFix on two bug datasets: Defects4J and Bugs.jar-ELIXIR. The results confirm that sharpFix is an improvement over ssFix. For the Defects4J dataset, sharpFix successfully repaired a total of 36 bugs and outperformed many existing repair techniques in repairing more bugs. For the Bugs.jar-ELIXIR dataset, we compared sharpFix, ssFix, and four other APR techniques, and found that sharpFix has the best repair performance. In essence, the paper shows how effective a syntactic search-based approach can be and what techniques should be used for such an approach.

SE Jul 21, 2017
Learning Program Component Order

Steven P. Reiss, Qi Xin

Successful programs are written to be maintained. One aspect of this is that programmers order the components in the code files in a particular way. This is part of programming style. While the conventions for ordering are sometimes given as part of a style guideline, such guidelines are often incomplete and programmers tend to have their own more comprehensive orderings in mind. This paper defines a model for ordering program components and shows how this model can be learned from sample code. Such a model is a useful tool for a programming environment in that it can be used to find the proper location for inserting new components or for reordering files to better meet the needs of the programmer. The model is designed so that it can be fine-tuned by the programmer. The learning framework is evaluated both by looking at code with known style guidelines and by testing whether it inserts existing components into a file correctly.