Qin Li

LG
h-index 28
88 papers
909 citations
Novelty 46%
AI Score 56

88 Papers

90.0 SE Mar 22 Code
From Natural Language to Executable Properties for Property-based Testing of Mobile Apps

Yiheng Xiong, Ting Su, Jingling Sun et al.

Property-based testing (PBT) is a popular software testing methodology and is effective in validating the functionality of mobile applications (apps for short). However, its adoption in practice remains limited, largely due to the manual effort and technical expertise required to specify executable properties. In this experience paper, we propose a novel structured property synthesis approach that automatically translates property descriptions in natural language into executable properties, and implement it in a tool named iPBT. Our approach decomposes the problem into UI semantic grounding and executable property synthesis. It first builds an enriched widget context via multimodal LLMs to align visual elements with their functional semantics, and then uses an LLM with in-context learning to generate framework-specific executable properties. We evaluate iPBT with a closed-source LLM (GPT-4o) and an open-source LLM (DeepSeek-V3) on 124 diverse property descriptions derived from an existing benchmark dataset. iPBT achieves 95.2% (118/124) accuracy on both LLMs. Notably, an ablation study reveals that the enriched widget context contributes to an absolute improvement of up to 20.2% (from 75.0% to 95.2%). A user study with 10 participants demonstrates that iPBT reduces the time required to write executable properties by 56%, suggesting substantially lower manual effort. Furthermore, evaluations on 1,180 linguistically diverse variations demonstrate iPBT's robustness (87.6% accuracy), indicating its capability to handle varied expressions.

NA Dec 5, 2016
Exploring the locally low dimensional structure in solving random elliptic PDEs

Thomas Y. Hou, Qin Li, Pengchuan Zhang

We propose a stochastic multiscale finite element method (StoMsFEM) to solve random elliptic partial differential equations with a high stochastic dimension. The key idea is to simultaneously upscale the stochastic solutions in the physical space for all random samples and explore the low stochastic dimensions of the stochastic solution within each local patch. We propose two effective methods to achieve this simultaneous local upscaling. The first method is a high order interpolation method in the stochastic space that explores the high regularity of the local upscaled quantities with respect to the random variables. The second method is a reduced-order method that explores the low rank property of the multiscale basis functions within each coarse grid patch. Our complexity analysis shows that, compared with the standard FEM on a fine grid, the StoMsFEM can achieve computational savings on the order of $(H/h)^{d}/(\log(H/h))^k$, where $H/h$ is the ratio between the coarse and the fine grid sizes, $d$ is the physical dimension and $k$ is the local stochastic dimension. Several numerical examples are presented to demonstrate the accuracy and effectiveness of the proposed methods. In the high contrast example, we observe a factor of 2000 speed-up.
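To get a feel for the stated scaling, the expression $(H/h)^{d}/(\log(H/h))^k$ can be evaluated directly. The short sketch below does only that; the function name `stomsfem_saving`, the use of the natural logarithm, and the omission of any constant factor are assumptions made for illustration and are not taken from the paper.

```python
import math


def stomsfem_saving(ratio, d, k):
    """Evaluate (H/h)^d / (log(H/h))^k for a coarse-to-fine ratio H/h.

    Assumptions: natural logarithm, and no hidden constant factor.
    """
    if ratio <= 1:
        raise ValueError("the ratio H/h must be greater than 1")
    return ratio ** d / math.log(ratio) ** k


# Hypothetical setting: H/h = 32 in two physical dimensions,
# with local stochastic dimension k = 3.
saving = stomsfem_saving(32, d=2, k=3)
print(f"order-of-magnitude saving: {saving:.1f}")
```

The expression grows polynomially in $H/h$ and is only damped by a logarithmic factor, so a larger local stochastic dimension $k$ reduces the saving without removing it.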

OC Dec 19, 2018
Randomized sampling for basis functions construction in generalized finite element methods

Ke Chen, Qin Li, Jianfeng Lu et al.

In the framework of generalized finite element methods for elliptic equations with rough coefficients, efficiency and accuracy of the numerical method depend critically on the use of appropriate basis functions. This work explores several random sampling strategies that construct approximations to the optimal set of basis functions of a given dimension, and proposes a quantitative criterion to analyze and compare these sampling strategies. Numerical evidence shows that the best results are achieved by two strategies, Random Gaussian and Smooth boundary sampling.

SR Mar 27, 2022
Predicting Solar Energetic Particles Using SDO/HMI Vector Magnetic Data Products and a Bidirectional LSTM Network

Yasser Abduallah, Vania K. Jordanova, Hao Liu et al.

Solar energetic particles (SEPs) are an essential source of space radiation, which are hazards for humans in space, spacecraft, and technology in general. In this paper we propose a deep learning method, specifically a bidirectional long short-term memory (biLSTM) network, to predict if an active region (AR) would produce an SEP event given that (i) the AR will produce an M- or X-class flare and a coronal mass ejection (CME) associated with the flare, or (ii) the AR will produce an M- or X-class flare regardless of whether or not the flare is associated with a CME. The data samples used in this study are collected from the Geostationary Operational Environmental Satellite's X-ray flare catalogs provided by the National Centers for Environmental Information. We select M- and X-class flares with identified ARs in the catalogs for the period between 2010 and 2021, and find the associations of flares, CMEs and SEPs in the Space Weather Database of Notifications, Knowledge, Information during the same period. Each data sample contains physical parameters collected from the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory. Experimental results based on different performance metrics demonstrate that the proposed biLSTM network is better than related machine learning algorithms for the two SEP prediction tasks studied here. We also discuss extensions of our approach for probabilistic forecasting and calibration with empirical evaluation.

MTRL-SCI Jul 10, 2023 Code
MD-HIT: Machine learning for materials property prediction with dataset redundancy control

Qin Li, Nihang Fu, Sadman Sadeed Omee et al.

Materials datasets typically contain many redundant (highly similar) materials, a consequence of the tinkering style of material design over the history of materials research. For example, the Materials Project database has many perovskite cubic structure materials similar to SrTiO$_3$. This sample redundancy causes random splitting to fail as an evaluation protocol for machine learning models: the ML models tend to achieve overestimated predictive performance, which is misleading for the materials science community. This issue is well known in bioinformatics for protein function prediction, where a redundancy reduction procedure (CD-HIT) is always applied to reduce sample redundancy by ensuring that no pair of samples has a sequence similarity greater than a given threshold. This paper surveys the overestimated ML performance in the literature for both composition-based and structure-based material property prediction. We then propose a material dataset redundancy reduction algorithm called MD-HIT and evaluate it with several composition- and structure-based distance thresholds for reducing dataset sample redundancy. We show that with this control, the predicted performance tends to better reflect the models' true prediction capability. Our MD-HIT code can be freely accessed at https://github.com/usccolumbia/MD-HIT
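The threshold idea described above can be sketched as a greedy filter: a sample is kept only if it is at least a given distance from every sample kept so far. This is a minimal illustration of CD-HIT-style redundancy control in general, not the MD-HIT implementation; the helper names `reduce_redundancy` and `euclidean`, the toy feature vectors, and the choice of Euclidean distance are all assumptions.

```python
def reduce_redundancy(samples, distance, threshold):
    """Greedily keep samples so that every kept pair is at least `threshold` apart."""
    kept = []
    for candidate in samples:
        # Keep the candidate only if it is not too close to anything already kept.
        if all(distance(candidate, other) >= threshold for other in kept):
            kept.append(candidate)
    return kept


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical 2D feature vectors; two of them are near-duplicates of earlier ones.
features = [(0.0, 0.0), (0.05, 0.0), (1.0, 1.0), (1.0, 1.02), (3.0, 0.5)]
representatives = reduce_redundancy(features, euclidean, threshold=0.1)
print(representatives)
```

The result of a greedy filter depends on the order in which samples are visited, so a real pipeline would need to fix that order deliberately.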

NA Dec 5, 2016
A sparse decomposition of low rank symmetric positive semi-definite matrices

Thomas Y. Hou, Qin Li, Pengchuan Zhang

Suppose that $A \in \mathbb{R}^{N \times N}$ is symmetric positive semidefinite with rank $K \le N$. Our goal is to decompose $A$ into $K$ rank-one matrices $\sum_{k=1}^K g_k g_k^T$ where the modes $\{g_{k}\}_{k=1}^K$ are required to be as sparse as possible. In contrast to eigen decomposition, these sparse modes are not required to be orthogonal. Such a problem arises in random field parametrization where $A$ is the covariance function and is intractable to solve in general. In this paper, we partition the indices from 1 to $N$ into several patches and propose to quantify the sparseness of a vector by the number of patches on which it is nonzero, which is called patch-wise sparseness. Our aim is to find the decomposition which minimizes the total patch-wise sparseness of the decomposed modes. We propose a domain-decomposition type method, called intrinsic sparse mode decomposition (ISMD), which follows the "local-modes-construction + patching-up" procedure. The key step in the ISMD is to construct local pieces of the intrinsic sparse modes by a joint diagonalization problem. Thereafter a pivoted Cholesky decomposition is utilized to glue these local pieces together. Optimal sparse decomposition, consistency with different domain decomposition and robustness to small perturbation are proved under the so called regular-sparse assumption (see Definition 1.2). We provide simulation results to show the efficiency and robustness of the ISMD. We also compare the ISMD to other existing methods, e.g., eigen decomposition, pivoted Cholesky decomposition and convex relaxation of sparse principal component analysis [25] and [40].
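One building block named above, the pivoted Cholesky decomposition, already writes a low-rank symmetric positive semidefinite matrix as $\sum_k g_k g_k^T$. The sketch below implements only that standard building block in plain Python; it is not ISMD, and the function name `pivoted_cholesky`, the tolerance, and the toy matrix are assumptions chosen for illustration.

```python
def pivoted_cholesky(A, tol=1e-10):
    """Return vectors g_k with A approximately equal to sum_k g_k g_k^T.

    A is a symmetric positive semidefinite matrix given as a list of rows.
    """
    n = len(A)
    residual = [list(map(float, row)) for row in A]  # working copy of A
    factors = []
    for _ in range(n):  # the rank can never exceed n
        # Pivot on the largest remaining diagonal entry.
        p = max(range(n), key=lambda i: residual[i][i])
        if residual[p][p] <= tol:
            break
        scale = residual[p][p] ** 0.5
        g = [residual[i][p] / scale for i in range(n)]
        factors.append(g)
        # Subtract the rank-one term g g^T from the residual.
        for i in range(n):
            for j in range(n):
                residual[i][j] -= g[i] * g[j]
    return factors


# Toy matrix built from two modes with disjoint supports:
# (1, 1, 0, 0) and (0, 0, 2, 1).
A = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 2.0],
    [0.0, 0.0, 2.0, 1.0],
]
factors = pivoted_cholesky(A)
print(len(factors), factors)
```

In this toy case the supports of the two modes are disjoint, so the factors happen to come out sparse. In general pivoted Cholesky alone does not produce sparse modes, which is why the abstract uses it only to glue together local pieces.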

AP Nov 30, 2016
A convergent method for linear half-space kinetic equations

Qin Li, Jianfeng Lu, Weiran Sun

We give a unified proof for the well-posedness of a class of linear half-space equations with general incoming data and construct a Galerkin method to numerically resolve this type of equation in a systematic way. Our main strategy in both analysis and numerics includes three steps: adding damping terms to the original half-space equation, using an inf-sup argument and even-odd decomposition to establish the well-posedness of the damped equation, and then recovering solutions to the original half-space equation. The proposed numerical method for the damped equation is shown to be quasi-optimal, and the numerical error of approximations to the original equation is controlled by that of the damped equation. This efficient solution to the half-space problem is useful for kinetic-fluid coupling simulations.

SR Oct 8, 2022
Inferring Line-of-Sight Velocities and Doppler Widths from Stokes Profiles of GST/NIRIS Using Stacked Deep Neural Networks

Haodi Jiang, Qin Li, Yan Xu et al.

Obtaining high-quality magnetic and velocity fields through Stokes inversion is crucial in solar physics. In this paper, we present a new deep learning method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight (LOS) velocities and Doppler widths from Stokes profiles collected by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory (BBSO). The training data of SDNN is prepared by a Milne-Eddington (ME) inversion code used by BBSO. We quantitatively assess SDNN, comparing its inversion results with those obtained by the ME inversion code and related machine learning (ML) algorithms such as multiple support vector regression, multilayer perceptrons and a pixel-level convolutional neural network. Major findings from our experimental study are summarized as follows. First, the SDNN-inferred LOS velocities are highly correlated to the ME-calculated ones with the Pearson product-moment correlation coefficient being close to 0.9 on average. Second, SDNN is faster, while producing smoother and cleaner LOS velocity and Doppler width maps, than the ME inversion code. Third, the maps produced by SDNN are closer to ME's maps than those from the related ML algorithms, demonstrating the better learning capability of SDNN than the ML algorithms. Finally, comparison between the inversion results of ME and SDNN based on GST/NIRIS and those from the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory in flare-prolific active region NOAA 12673 is presented. We also discuss extensions of SDNN for inferring vector magnetic fields with empirical evaluation.

NA Feb 5, 2015
Diffusion approximations and domain decomposition method of linear transport equations: asymptotics and numerics

Qin Li, Jianfeng Lu, Weiran Sun

In this paper we construct numerical schemes to approximate linear transport equations with slab geometry by diffusion equations. We treat both the case of pure diffusive scaling and the case where kinetic and diffusive scalings coexist. The diffusion equations and their data are derived from asymptotic and layer analysis which allows general scattering kernels and general data. We apply the half-space solver in [20] to resolve the boundary layer equation and obtain the boundary data for the diffusion equation. The algorithms are validated by numerical experiments and also by error analysis for the pure diffusive scaling case.

NA Aug 13, 2012
Exponential Runge-Kutta schemes for inhomogeneous Boltzmann equations with high order of accuracy

Qin Li, Lorenzo Pareschi

We consider the development of exponential methods for the robust time discretization of space inhomogeneous Boltzmann equations in stiff regimes. Compared to the space homogeneous case, or more generally to the case of splitting-based methods, studied in Dimarco and Pareschi (SIAM J. Num. Anal. 2011), a major difficulty is that the local Maxwellian equilibrium state is not constant in a time step and thus needs a proper numerical treatment. We show how to derive asymptotic preserving (AP) schemes of arbitrary order and, in particular, using the Shu-Osher representation of Runge-Kutta methods, we explore the monotonicity properties of such schemes, such as strong stability preserving (SSP) and positivity preserving. Several numerical results confirm our analysis.

AP Dec 1, 2016
Validity and regularization of classical half-space equations

Qin Li, Jianfeng Lu, Weiran Sun

A recent result [Wu and Guo, Comm. Math. Phys., 2015] has shown that over the 2D unit disk, the classical half-space equation (CHS) for neutron transport does not capture the correct boundary layer behaviour, as was long believed. In this paper we develop a regularization technique for CHS to arbitrary order and use its first-order regularization to show that in the case of the 2D unit disk, although CHS misrepresents the boundary layer behaviour, it does give the correct boundary condition for the interior macroscopic (Laplace) equation. Therefore CHS is still a valid equation to recover the correct boundary condition for the interior Laplace equation over the 2D unit disk.

DS Aug 21, 2023
Beyond expectations: Residual Dynamic Mode Decomposition and Variance for Stochastic Dynamical Systems

Matthew J. Colbrook, Qin Li, Ryan V. Raut et al.

Koopman operators linearize nonlinear dynamical systems, making their spectral information of crucial interest. Numerous algorithms have been developed to approximate these spectral properties, and Dynamic Mode Decomposition (DMD) stands out as the poster child of projection-based methods. Although the Koopman operator itself is linear, the fact that it acts in an infinite-dimensional space of observables poses challenges. These include spurious modes, essential spectra, and the verification of Koopman mode decompositions. While recent work has addressed these challenges for deterministic systems, there remains a notable gap in verified DMD methods for stochastic systems, where the Koopman operator measures the expectation of observables. We show that it is necessary to go beyond expectations to address these issues. By incorporating variance into the Koopman framework, we address these challenges. Through an additional DMD-type matrix, we approximate the sum of a squared residual and a variance term, each of which can be approximated individually using batched snapshot data. This allows verified computation of the spectral properties of stochastic Koopman operators, controlling the projection error. We also introduce the concept of variance-pseudospectra to gauge statistical coherency. Finally, we present a suite of convergence results for the spectral information of stochastic Koopman operators. Our study concludes with practical applications using both simulated and experimental data. In neural recordings from awake mice, we demonstrate how variance-pseudospectra can reveal physiologically significant information unavailable to standard expectation-based dynamical models.

NA Apr 10, 2017
Uniform regularity for linear kinetic equations with random input based on hypocoercivity

Qin Li, Li Wang

In this paper we study the effect of randomness in kinetic equations that preserve mass. Our focus is on proving the analyticity of the solution with respect to the randomness, which naturally leads to the convergence of numerical methods. The analysis is carried out in a general setting, with the regularity result not depending on the specific form of the collision term, the probability distribution of the random variables, or the regime the system is in, and is thereby termed "uniform". Applications include the linear Boltzmann equation, the BGK model, and the Carleman model, among many others; the results hold true in kinetic, parabolic and high field regimes. The proof relies on the explicit expression of the high order derivatives of the solution in the random space, and the convergence in time is mainly based on hypocoercivity, which, despite its popularity in PDE analysis of kinetic theory, has rarely been used for numerical algorithms.

LG Dec 12, 2022
Spatial-temporal traffic modeling with a fusion graph reconstructed by tensor decomposition

Qin Li, Xuan Yang, Yong Wang et al.

Accurate spatial-temporal traffic flow forecasting is essential for helping traffic managers to take control measures and drivers to choose the optimal travel routes. Recently, graph convolutional networks (GCNs) have been widely used in traffic flow prediction owing to their powerful ability to capture spatial-temporal dependencies. The design of the spatial-temporal graph adjacency matrix is key to the success of GCNs, and it is still an open question. This paper proposes a traffic flow forecasting method that reconstructs the binary adjacency matrix via tensor decomposition. First, we reformulate the spatial-temporal fusion graph adjacency matrix into a three-way adjacency tensor. Then, we reconstruct the adjacency tensor via Tucker decomposition, wherein more informative and global spatial-temporal dependencies are encoded. Finally, a Spatial-temporal Synchronous Graph Convolutional module for learning localized spatial-temporal correlations and a Dilated Convolution module for learning global correlations are assembled to aggregate and learn the comprehensive spatial-temporal dependencies of the road network. Experimental results on four open-access datasets demonstrate that the proposed model outperforms state-of-the-art approaches in terms of prediction performance and computational cost.

SR Nov 4, 2022
A Deep Learning Approach to Generating Photospheric Vector Magnetograms of Solar Active Regions for SOHO/MDI Using SDO/HMI and BBSO Data

Haodi Jiang, Qin Li, Zhihang Hu et al.

Solar activity is usually caused by the evolution of solar magnetic fields. Magnetic field parameters derived from photospheric vector magnetograms of solar active regions have been used to analyze and forecast eruptive events such as solar flares and coronal mass ejections. Unfortunately, the most recent solar cycle 24 was relatively weak with few large flares, though it is the only solar cycle in which consistent time-sequence vector magnetograms have been available through the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO) since its launch in 2010. In this paper, we look into another major instrument, namely the Michelson Doppler Imager (MDI) on board the Solar and Heliospheric Observatory (SOHO) from 1996 to 2010. The data archive of SOHO/MDI covers more active solar cycle 23 with many large flares. However, SOHO/MDI data only has line-of-sight (LOS) magnetograms. We propose a new deep learning method, named MagNet, to learn from combined LOS magnetograms, Bx and By taken by SDO/HMI along with H-alpha observations collected by the Big Bear Solar Observatory (BBSO), and to generate vector components Bx' and By', which would form vector magnetograms with observed LOS data. In this way, we can expand the availability of vector magnetograms to the period from 1996 to present. Experimental results demonstrate the good performance of the proposed method. To our knowledge, this is the first time that deep learning has been used to generate photospheric vector magnetograms of solar active regions for SOHO/MDI using SDO/HMI and H-alpha data.

LG Jun 3, 2023
Correcting Auto-Differentiation in Neural-ODE Training

Yewei Xu, Shi Chen, Qin Li

Does the use of auto-differentiation yield reasonable updates for deep neural networks (DNNs)? Specifically, when DNNs are designed to adhere to neural ODE architectures, can we trust the gradients provided by auto-differentiation? Through mathematical analysis and numerical evidence, we demonstrate that when neural networks employ high-order methods, such as Linear Multistep Methods (LMM) or Explicit Runge-Kutta Methods (ERK), to approximate the underlying ODE flows, brute-force auto-differentiation often introduces artificial oscillations in the gradients that prevent convergence. In the case of Leapfrog and 2-stage ERK, we propose simple post-processing techniques that effectively eliminate these oscillations, correct the gradient computation and thus return accurate updates.

NA Feb 1, 2016
Implicit Asymptotic Preserving Method for Linear Transport Equations

Qin Li, Li Wang

The computation of the radiative transfer equation is expensive mainly due to two stiff terms: the transport term and the collision operator. The stiffness in the former comes from the fact that particles (such as photons) travel at the speed of light, while that in the latter is due to the strong scattering in the diffusive regime. We study the fully implicit scheme for this equation to account for the stiffness. The main challenge in the implicit treatment is the coupling between the spatial and velocity coordinates, which makes the to-be-inverted matrix large, ill-conditioned and not necessarily symmetric. Our main idea is to utilize the spectral structure of the ill-conditioned matrix to construct a pre-conditioner, which, along with an exquisite split of the spatial and angular dependence, significantly improves the condition number and allows matrix-free treatment. We also design a fast solver to compute this pre-conditioner explicitly in advance. Meanwhile, we reformulate the system via an even-odd parity, which results in a symmetric and positive definite matrix that can be inverted using the conjugate gradient method. This idea can also be applied to the original non-symmetric system, whose inversion is solved by GMRES. A qualitative comparison with the conventional methods, including the Krylov iterative method pre-conditioned with diffusive synthetic acceleration and the asymptotic preserving scheme via even-odd decomposition, is also discussed.

NA Jun 5, 2019
A Numerical method for coupling the BGK model and Euler equation through the linearized Knudsen layer

Hongxu Chen, Qin Li, Jianfeng Lu

The Bhatnagar-Gross-Krook (BGK) model, a simplification of the Boltzmann equation, converges to the Euler equations in the absence of boundary effects when the Knudsen number is small. In practice, however, Knudsen layers emerge at the physical boundary, or at the interfaces between the two regimes. We model the Knudsen layer using a half-space kinetic equation, and apply a half-space numerical solver [ESAIM: M2AN 51 (2017) 1583-1615] [Math. Comp. 86 (2017), 1269-1301] to quantify the transition from the kinetic to the fluid regime. A full domain numerical solver is developed with a domain-decomposition approach, where we apply the Euler solver and kinetic solver on the appropriate subdomains and connect them via the half-space solver. In the nonlinear case, linearization is performed around the local Maxwellian. Despite the lack of analytical support, the numerical evidence nevertheless demonstrates that the linearization approach is promising.

NA Oct 6, 2017
Stability of Stationary Inverse Transport Equation in Diffusion Scaling

Ke Chen, Qin Li, Li Wang

We consider the inverse problem of reconstructing the optical parameters for the stationary radiative transfer equation (RTE) from velocity-averaged measurements. The RTE often contains multiple scales characterized by the magnitude of a dimensionless parameter, the Knudsen number ($K_n$). In the diffusive scaling ($K_n \ll 1$), the stationary RTE is well approximated by an elliptic equation in the forward setting. However, the inverse problem for the elliptic equation is acknowledged to be severely ill-posed, in contrast to the well-posedness of the inverse transport equation, which raises the question of how uniqueness is lost as $K_n \rightarrow 0$. We tackle this problem by examining the stability of the inverse problem with varying $K_n$. We show that the discrepancy in two measurements is amplified in the reconstructed parameters at the order of $K_n^p$ ($p = 1$ or $2$), and as a result leads to ill-posedness in the zero limit of $K_n$. Our results apply to both continuous and discrete settings. Some numerical tests are performed at the end to validate these theoretical findings.

NA Apr 25, 2017
An asymptotic preserving method for transport equations with oscillatory scattering coefficients

Qin Li, Jianfeng Lu

We design a numerical scheme for transport equations with oscillatory periodic scattering coefficients. The scheme is asymptotic preserving in the diffusion limit as Knudsen number goes to zero. It also captures the homogenization limit as the length scale of the scattering coefficient goes to zero. The proposed method is based on the construction of multiscale finite element basis and a Galerkin projection based on the even-odd decomposition. The method is analyzed in the asymptotic regime, as well as validated numerically.

29.2 NA May 17
MAGPIE: Multilevel-Adaptive-Guided Solver for Ptychographic Phase Retrieval

Borong Zhang, Qin Li, Zichao Wendy Di

We introduce MAGPIE (Multilevel-Adaptive-Guided Ptychographic Iterative Engine), a stochastic multigrid solver for the ptychographic phase-retrieval problem. The ptychographic phase-retrieval problem is inherently nonconvex and ill-posed. To address these challenges, we reformulate the original nonlinear and nonconvex inverse problem as the iterative minimization of a quadratic surrogate model that majorizes the original objective. This surrogate not only ensures favorable convergence properties but also generalizes the Ptychographic Iterative Engine (PIE) family of algorithms. By solving the surrogate model using a multigrid method, MAGPIE achieves substantial gains in convergence speed and reconstruction quality over traditional approaches.

NA May 27, 2018
Explicit Finite Element Error Estimates for Nonhomogeneous Neumann problems

Qin Li, Xuefeng Liu

The paper develops an explicit a priori error estimate for finite element solutions to nonhomogeneous Neumann problems. For this purpose, the hypercircle equation over finite element spaces is constructed and an explicit upper bound of the constant in the trace theorem is given. Numerical examples are shown in the final section, which indicate that the proposed error estimate has a convergence rate of $0.5$.

27.0 DS May 23
Finding Koopman Invariant Subspaces via Personalized PageRank

Hyukpyo Hong, Qin Li, Matthew J. Colbrook et al.

Selecting a finite dictionary of observables whose span is Koopman-invariant is a central challenge in data-driven Koopman operator approximation. We address this problem by exploiting zero-block structure in Extended Dynamic Mode Decomposition (EDMD) matrices. We show that any sub-dictionary whose span is Koopman-invariant induces an exact zero block in the EDMD matrix, even for finite data. We then show that such blocks can be detected by applying PageRank to a row-normalized EDMD matrix constructed from a large initial dictionary. The theory extends to approximately invariant subspaces and yields stronger guarantees for personalized PageRank (PPR) when the seed observables lie inside the target block and reach all observables in that block. Combining EDMD concentration bounds with PageRank perturbation theory gives end-to-end detection guarantees with $O(1/\sqrt{M})$ finite-sample scaling and explicit constants. More generally, without assuming an invariant subspace exists, high PPR mass on a sub-dictionary controls discounted multi-step leakage from the seed observables. Numerical experiments on the Duffing oscillator, Van der Pol oscillator, Lorenz system, and a three-well Ramachandran potential suggest that the method identifies compact, interpretable dictionaries with accurate predictions.
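The detection idea can be illustrated on a toy matrix. The sketch below row-normalizes the absolute values of a small matrix with an exact zero block and runs personalized PageRank by fixed-point iteration. It is a hedged illustration only: the convention that row $i$ holds the coefficients of the image of observable $i$, the teleport probability `alpha`, the function name `personalized_pagerank`, and the matrix entries are all assumptions, not details taken from the paper.

```python
def personalized_pagerank(K, seeds, alpha=0.15, iters=200):
    """Personalized PageRank on the row-normalized absolute values of K.

    Iterates pi <- alpha * s + (1 - alpha) * pi P, where s is uniform on `seeds`.
    Assumes every row of K has at least one nonzero entry.
    """
    n = len(K)
    P = []
    for row in K:
        total = sum(abs(v) for v in row)
        P.append([abs(v) / total for v in row])
    s = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    pi = s[:]
    for _ in range(iters):
        pi = [
            alpha * s[j] + (1 - alpha) * sum(pi[i] * P[i][j] for i in range(n))
            for j in range(n)
        ]
    return pi


# Toy matrix: rows 0 and 1 have zeros in columns 2 and 3, so the
# sub-dictionary {0, 1} is closed under the (assumed) row convention.
K = [
    [0.9, 0.1, 0.0, 0.0],
    [-0.2, 0.8, 0.0, 0.0],
    [0.3, 0.1, 0.5, 0.1],
    [0.0, 0.2, 0.4, 0.4],
]
pi = personalized_pagerank(K, seeds={0})
print([round(p, 4) for p in pi])
```

With the seed inside the closed block, no PageRank mass reaches indices 2 and 3, so thresholding `pi` would recover the sub-dictionary {0, 1} in this toy case.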

CV Mar 3
BRIGHT: A Collaborative Generalist-Specialist Foundation Model for Breast Pathology

Xiaojing Guo, Jiatai Lin, Yumian Jia et al.

Generalist pathology foundation models (PFMs), pretrained on large-scale multi-organ datasets, have demonstrated remarkable predictive capabilities across diverse clinical applications. However, their proficiency on the full spectrum of clinically essential tasks within a specific organ system remains an open question due to the lack of large-scale validation cohorts for a single organ as well as the absence of a tailored training paradigm that can effectively translate broad histomorphological knowledge into the organ-specific expertise required for specialist-level interpretation. In this study, we propose BRIGHT, the first PFM specifically designed for breast pathology, trained on approximately 210 million histopathology tiles from over 51,000 breast whole-slide images derived from a cohort of over 40,000 patients across 19 hospitals. BRIGHT employs a collaborative generalist-specialist framework to capture both universal and organ-specific features. To comprehensively evaluate the performance of PFMs on breast oncology, we curate the largest multi-institutional cohorts to date for downstream task development and evaluation, comprising over 25,000 WSIs across 10 hospitals. The validation cohorts cover the full spectrum of breast pathology across 24 distinct clinical tasks spanning diagnosis, biomarker prediction, treatment response and survival prediction. Extensive experiments demonstrate that BRIGHT outperforms three leading generalist PFMs, achieving state-of-the-art (SOTA) performance in 21 of 24 internal validation tasks and in 5 of 10 external validation tasks with excellent heatmap interpretability. By evaluating on large-scale validation cohorts, this study not only demonstrates BRIGHT's clinical utility in breast oncology but also validates a collaborative generalist-specialist paradigm, providing a scalable template for developing PFMs on a specific organ system.

NA Dec 12, 2022
Solving the Wide-band Inverse Scattering Problem via Equivariant Neural Networks

Borong Zhang, Leonardo Zepeda-Núñez, Qin Li

This paper introduces a novel deep neural network architecture for solving the inverse scattering problem in the frequency domain with wide-band data, by directly approximating the inverse map, thus avoiding the expensive optimization loop of classical methods. The architecture is motivated by the filtered back-projection formula in the full aperture regime and with homogeneous background, and it leverages the underlying equivariance of the problem and compressibility of the integral operator. This drastically reduces the number of training parameters, and therefore the computational and sample complexity of the method. In particular, we obtain an architecture whose number of parameters scales sub-linearly with respect to the dimension of the inputs, while its inference complexity scales super-linearly but with very small constants. We provide several numerical tests that show that the current approach results in better reconstruction than optimization-based techniques such as full-waveform inversion, but at a fraction of the cost, while being competitive with state-of-the-art machine learning methods.

LGAug 5, 2024
Back-Projection Diffusion: Solving the Wideband Inverse Scattering Problem with Diffusion Models

Borong Zhang, Martín Guerra, Qin Li et al.

We present Wideband Back-Projection Diffusion, an end-to-end probabilistic framework for approximating the posterior distribution induced by the inverse scattering map from wideband scattering data. This framework produces highly accurate reconstructions, leveraging conditional diffusion models to draw samples, and also honors the symmetries of the underlying physics of wave propagation. The procedure is factored into two steps: the first step, inspired by the filtered back-projection formula, transforms data into a physics-based latent representation, while the second step learns a conditional score function conditioned on this latent representation. These two steps individually obey their associated symmetries and are amenable to compression by imposing the rank structure found in the filtered back-projection formula. Empirically, our framework has both low sample and computational complexity, with its number of parameters scaling only sub-linearly with the target resolution, and has stable training dynamics. It provides sharp reconstructions effortlessly and is capable of recovering even sub-Nyquist features in the multiple-scattering regime.

NAFeb 27, 2019
Diffusive optical tomography in the Bayesian framework

Kit Newton, Qin Li, Andrew Stuart

Many naturally occurring models in the sciences are well-approximated by simplified models, using multiscale techniques. In such settings it is natural to ask about the relationship between inverse problems defined by the original problem and by the multiscale approximation. We develop an approach to this problem and exemplify it in the context of optical tomographic imaging. Optical tomographic imaging is a technique for inferring the properties of biological tissue via measurements of the incoming and outgoing light intensity; it may be used as a medical imaging methodology. Mathematically, light propagation is modeled by the radiative transfer equation (RTE), and optical tomography amounts to reconstructing the scattering and the absorption coefficients in the RTE from boundary measurements. We study this problem in the Bayesian framework, focussing on the strong scattering regime. In this regime the forward RTE is close to the diffusion equation (DE). We study the RTE in the asymptotic regime where the forward problem approaches the DE, and prove convergence of the inverse RTE to the inverse DE in both nonlinear and linear settings. Convergence is proved by studying the distance between the two posterior distributions using the Hellinger metric, and using Kullback-Leibler divergence.
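
For reference, the two distances named above can be written as follows. This is a sketch of the standard definitions, not taken from the paper: the symbols $\mu$ and $\nu$ (two posteriors) and $\lambda$ (a common dominating measure) are assumptions introduced here, and normalization conventions for the Hellinger metric vary between authors.

```latex
% Hellinger metric (normalized so that 0 <= d_H <= 1)
d_{\mathrm{H}}(\mu,\nu)^2
  = \frac{1}{2}\int \left(\sqrt{\frac{d\mu}{d\lambda}}
      - \sqrt{\frac{d\nu}{d\lambda}}\right)^{2} d\lambda ,
% Kullback-Leibler divergence (requires \mu absolutely continuous w.r.t. \nu)
\qquad
D_{\mathrm{KL}}(\mu\,\|\,\nu)
  = \int \log\!\left(\frac{d\mu}{d\nu}\right) d\mu .
% With this normalization the two are related by
d_{\mathrm{H}}(\mu,\nu)^2 \;\le\; \tfrac{1}{2}\, D_{\mathrm{KL}}(\mu\,\|\,\nu).
```

Under this convention, a bound on the Kullback-Leibler divergence between the RTE and DE posteriors would also control their Hellinger distance.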

CLSep 27, 2023
Integrating LLM, EEG, and Eye-Tracking Biomarker Analysis for Word-Level Neural State Classification in Semantic Inference Reading Comprehension

Yuhong Zhang, Qin Li, Sujal Nahata et al.

With the recent proliferation of large language models (LLMs), such as Generative Pre-trained Transformers (GPT), there has been a significant shift in exploring human and machine comprehension of semantic language meaning. This shift calls for interdisciplinary research that bridges cognitive science and natural language processing (NLP). This pilot study aims to provide insights into individuals' neural states during a semantic relation reading-comprehension task. We propose jointly analyzing LLMs, eye-gaze, and electroencephalographic (EEG) data to study how the brain processes words with varying degrees of relevance to a keyword during reading. We also use a feature engineering approach to improve the fixation-related EEG data classification while participants read words with high versus low relevance to the keyword. The best validation accuracy in this word-level classification is over 60% across 12 subjects. Words of high relevance to the inference keyword had significantly more eye fixations per word: 1.0584 compared to 0.6576 when excluding no-fixation words, and 1.5126 compared to 1.4026 when including them. This study represents the first attempt to classify brain states at a word level using LLM knowledge. It provides valuable insights into human cognitive abilities and the realm of Artificial General Intelligence (AGI), and offers guidance for developing potential reading-assisted technologies.

OCOct 6, 2023
Accelerating optimization over the space of probability measures

Shi Chen, Qin Li, Oliver Tse et al.

The acceleration of gradient-based optimization methods is a subject of significant practical and theoretical importance, particularly within machine learning applications. While much attention has been directed towards optimizing within Euclidean space, the need to optimize over spaces of probability measures in machine learning motivates exploration of accelerated gradient methods in this context too. To this end, we introduce a Hamiltonian-flow approach analogous to momentum-based approaches in Euclidean space. We demonstrate that, in the continuous-time setting, algorithms based on this approach can achieve convergence rates of arbitrarily high order. We complement our findings with numerical examples.
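
As a point of comparison, the Euclidean momentum idea that the abstract refers to can be sketched with the classical heavy-ball iteration. This is only the finite-dimensional analogue, not the Hamiltonian flow over probability measures proposed in the paper; the quadratic test function, step sizes, and all function names are assumptions chosen for illustration.

```python
def f(x):
    # Ill-conditioned quadratic test function (hypothetical example).
    return 0.5 * (x[0] ** 2 + 100.0 * x[1] ** 2)

def grad(x):
    # Gradient of f.
    return [x[0], 100.0 * x[1]]

def gradient_descent(x0, step, iters):
    # Plain gradient descent: x <- x - step * grad(x).
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def heavy_ball(x0, step, momentum, iters):
    # Heavy-ball update: gradient step plus momentum * (x - x_prev).
    x, x_prev = list(x0), list(x0)
    for _ in range(iters):
        g = grad(x)
        x_next = [xi - step * gi + momentum * (xi - pi)
                  for xi, gi, pi in zip(x, g, x_prev)]
        x_prev, x = x, x_next
    return x

# Curvatures are 1 and 100, so L = 100 and mu = 1.
x_gd = gradient_descent([1.0, 1.0], step=1.0 / 100.0, iters=100)
# Classical heavy-ball tuning for a quadratic with these curvatures.
x_hb = heavy_ball([1.0, 1.0], step=4.0 / 121.0, momentum=(9.0 / 11.0) ** 2, iters=100)
```

With the same number of iterations, the momentum iterate ends much closer to the minimizer than plain gradient descent on this example, which is the behavior that accelerated methods are designed to deliver.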

CLJul 26, 2024
ChatSchema: A pipeline of extracting structured information with Large Multimodal Models based on schema

Fei Wang, Yuewen Zheng, Qin Li et al.

Objective: This study introduces ChatSchema, an effective method for extracting and structuring information from unstructured data in medical paper reports using a combination of Large Multimodal Models (LMMs) and Optical Character Recognition (OCR) based on the schema. By integrating a predefined schema, we intend to enable LMMs to directly extract and standardize information according to the schema specifications, facilitating further data entry. Method: Our approach involves a two-stage process, including classification and extraction for categorizing report scenarios and structuring information. We established and annotated a dataset to verify the effectiveness of ChatSchema, and evaluated key extraction using precision, recall, F1-score, and accuracy metrics. Based on key extraction, we further assessed value extraction. We conducted ablation studies on two LMMs to illustrate the improvement of structured information extraction with different input modalities and methods. Result: We analyzed 100 medical reports from Peking University First Hospital and established a ground truth dataset with 2,945 key-value pairs. We evaluated ChatSchema using GPT-4o and Gemini 1.5 Pro and found a higher overall performance of GPT-4o. The results are as follows: For key extraction, key-precision was 98.6%, key-recall was 98.5%, and key-F1-score was 98.6%. For value extraction based on correct key extraction, the overall accuracy was 97.2%, precision was 95.8%, recall was 95.8%, and F1-score was 95.8%. An ablation study demonstrated that ChatSchema achieved significantly higher overall accuracy and overall F1-score of key-value extraction compared to the Baseline, with increases of 26.9% in overall accuracy and 27.4% in overall F1-score.
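
The key-extraction metrics named above can be illustrated with a minimal sketch. It assumes keys are compared as exact-match sets, which may differ from the paper's matching rule; the function name and the sample field names are hypothetical.

```python
def key_extraction_metrics(predicted_keys, truth_keys):
    """Precision, recall, and F1 over sets of extracted keys (exact match)."""
    predicted, truth = set(predicted_keys), set(truth_keys)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical report fields, for illustration only.
truth = ["patient_name", "exam_date", "hemoglobin", "platelets"]
predicted = ["patient_name", "exam_date", "hemoglobin", "wbc"]
precision, recall, f1 = key_extraction_metrics(predicted, truth)
```

Here three of the four predicted keys are correct and three of the four true keys are found, so all three metrics equal 0.75.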

CVMar 8, 2024Code
Beyond MOT: Semantic Multi-Object Tracking

Yunhao Li, Qin Li, Hao Wang et al.

Current multi-object tracking (MOT) aims to predict trajectories of targets (i.e., "where") in videos. Yet, knowing merely "where" is insufficient in many crucial applications. In comparison, semantic understanding such as fine-grained behaviors, interactions, and overall summarized captions (i.e., "what") from videos, associated with "where", is highly desired for comprehensive video analysis. Thus motivated, we introduce Semantic Multi-Object Tracking (SMOT), which aims to estimate object trajectories while also understanding semantic details of the associated trajectories, including instance captions, instance interactions, and overall video captions, integrating "where" and "what" for tracking. In order to foster the exploration of SMOT, we propose BenSMOT, a large-scale Benchmark for Semantic MOT. Specifically, BenSMOT comprises 3,292 videos with 151K frames, covering various scenarios for semantic tracking of humans. BenSMOT provides annotations for the trajectories of targets, along with associated instance captions in natural language, instance interactions, and an overall caption for each video sequence. To the best of our knowledge, BenSMOT is the first publicly available benchmark for SMOT. Besides, to encourage future research, we present a novel tracker named SMOTer, which is specially designed and end-to-end trained for SMOT, showing promising performance. By releasing BenSMOT, we expect to go beyond conventional MOT by predicting "where" and "what" for SMOT, opening up a new direction in tracking for video understanding. We will release BenSMOT and SMOTer at https://github.com/Nathan-Li123/SMOTer.

MLSep 30, 2024
Stochastic Inverse Problem: stability, regularization and Wasserstein gradient flow

Qin Li, Maria Oprea, Li Wang et al.

Inverse problems in physical or biological sciences often involve recovering an unknown parameter that is random. The sought-after quantity is a probability distribution of the unknown parameter, that produces data that aligns with measurements. Consequently, these problems are naturally framed as stochastic inverse problems. In this paper, we explore three aspects of this problem: direct inversion, variational formulation with regularization, and optimization via gradient flows, drawing parallels with deterministic inverse problems. A key difference from the deterministic case is the space in which we operate. Here, we work within probability space rather than Euclidean or Sobolev spaces, making tools from measure transport theory necessary for the study. Our findings reveal that the choice of metric -- both in the design of the loss function and in the optimization process -- significantly impacts the stability and properties of the optimizer.

36.1SEMay 17
Event-B Agent: Towards LLM Agent for Formal Model Synthesis and Repair

Hongshu Wang, Xinyue Zuo, Yuhan Sun et al.

Building software that is correct by construction is a long-standing goal in software engineering, as it ensures reliability during design and development rather than after deployment. Formal methods realize this vision by enabling the expression of system behavior and requirements in mathematics, thereby guaranteeing correctness through formal verification, including theorem proving and model checking. However, the steep learning curve and demand for mathematical expertise hinder the widespread adoption of formal methods. Large language models (LLMs) have recently shown promise in bridging this gap through autoformalization. However, existing LLM-based approaches are largely limited to isolated tasks, such as theorem proving without formalization or model synthesis with insufficient verification. While valuable, these efforts do not fully exploit the potential of a more comprehensive framework in which models and proofs evolve together, a process that closely reflects real-world development practice. To address this gap, we propose Event-B Agent, a novel framework inspired by the interleaved nature of software design. Given natural language requirements, Event-B Agent constructs an initial model and iteratively repairs and refines it using formal verification feedback. Refinement simplifies proof discharge, while repair of models and proofs ensures the soundness of each refinement step. Together, these two components reinforce each other to progressively improve the model quality. Evaluation across systems of varying complexity demonstrates that Event-B Agent substantially outperforms baselines in end-to-end formal model synthesis and repair, while maintaining reasonable efficiency. These results suggest that Event-B Agent is a promising step toward correct-by-construction formal model synthesis and repair.

SRMay 21, 2024Code
Global-local Fourier Neural Operator for Accelerating Coronal Magnetic Field Model

Yutao Du, Qin Li, Raghav Gnanasambandam et al.

Exploring the outer atmosphere of the sun has remained a significant bottleneck in astrophysics, given the intricate magnetic formations that significantly influence diverse solar events. Magnetohydrodynamics (MHD) simulations allow us to model the complex interactions between the sun's plasma, magnetic fields, and the surrounding environment. However, MHD simulation is extremely time-consuming, taking days or weeks to run. The goal of this study is to accelerate coronal magnetic field simulation using deep learning, specifically the Fourier Neural Operator (FNO). FNO has been proven to be an ideal tool for scientific computing and discovery in the literature. In this paper, we propose a global-local Fourier Neural Operator (GL-FNO) that contains two branches of FNOs: the global FNO branch takes downsampled input to reconstruct global features, while the local FNO branch takes original resolution input to capture fine details. The performance of GL-FNO is compared with state-of-the-art deep learning methods, including FNO, U-NO, U-FNO, Vision Transformer, CNN-RNN, and CNN-LSTM, to demonstrate its accuracy, computational efficiency, and scalability. Furthermore, physics analysis from domain experts is also performed to demonstrate the reliability of GL-FNO. The results demonstrate that GL-FNO not only accelerates the MHD simulation (a few seconds for prediction, a speed-up of more than 20,000×) but also provides reliable prediction capabilities, thus greatly contributing to the understanding of space weather dynamics. Our code implementation is available at https://github.com/Yutao-0718/GL-FNO

QUANT-PHAug 25, 2024
Verifiable cloud-based variational quantum algorithms

Junhong Yang, Banghai Wang, Junyu Quan et al.

Variational quantum algorithms (VQAs) have shown potential for quantum advantage with noisy intermediate-scale quantum (NISQ) devices for quantum machine learning (QML). However, given the high cost and limited availability of quantum resources, delegating VQAs via cloud networks is a more practical solution for clients with limited quantum capabilities. Recently, Shingu et al. [Physical Review A, 105, 022603 (2022)] proposed a variational secure cloud quantum computing protocol, utilizing ancilla-driven quantum computation (ADQC) for cloud-based VQAs with minimal quantum resource consumption. However, their protocol lacks verifiability, which exposes it to potential malicious behaviors by the server. Additionally, channel loss requires frequent re-delegation as the size of the delegated variational circuit grows, complicating verification due to increased circuit complexity. This paper introduces a new protocol to address these challenges and enhance both verifiability and tolerance to channel loss in cloud-based VQAs.

SOC-PHFeb 26
Supervised tax compliance and evasion from a spatial evolutionary game perspective

Qin Li, Ting Ling, Minyu Feng et al.

Taxation constitutes a fundamental component of modern national economic systems, exerting profound impacts on both societal functioning and governmental operations. In this paper, we employ an interdependent network approach to model the coevolution between citizens and regulators within a taxation system that fundamentally constitutes a public goods game framework with complex interactive dynamics. In a game layer, citizens engage in public goods games, facing the social dilemma of tax compliance (cooperation) versus evasion (defection). Tax compliance supports the sustainability of public finances while tax evasion presents markedly stronger short-term incentives. In a regulatory layer, fair regulators punish tax evaders, while corrupt regulators keep silent due to bribes. Governmental regulatory interventions introduce critical institutional constraints that alter the traditional equilibrium of the game. Importantly, there exists a strategy update not only among citizens but also among regulators. Our results indicate that strengthening penalties can effectively curb tax evasion, and the influence of bribery on both tax compliance rates and the proportion of fair regulators is nonlinear. Additionally, increasing regulators' salaries and intensifying the crackdown on corrupt regulators can foster the emergence of fair regulators, thereby reducing tax evasion among citizens. The results offer practical policy implications, suggesting that balanced deterrence and institutional fairness are essential to sustaining compliance, and point to the need for future empirical validation and model extensions.

24.3NAApr 11
Sensitivity-preserving of Fisher Information Matrix through random data down-sampling for experimental design

Kathrin Hellmuth, Christian Klingenberg, Qin Li

The quality of numerical reconstructions for unknown parameters in inverse problems depends fundamentally on the selection of experimental data. To ensure a robust reconstruction, it is crucial to select data that are sensitive to the parameters, a property typically characterized by the conditioning of the Fisher Information Matrix (FIM). In this work, we propose a general framework for an efficient down-sampling strategy that selects experimental setups that preserve the information content of the full-data FIM. Our approach leverages matrix sketching techniques from randomized numerical linear algebra to achieve a sensitivity-preserving approximation. The method involves drawing samples from a sensitivity-informed distribution, which we execute using gradient-free ensemble sampling methods to handle potentially non-smooth or discrete design spaces. Numerical experiments demonstrate the effectiveness of this framework in selecting optimal sensor locations for a Schroedinger potential reconstruction problem.
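
The down-sampling idea can be illustrated with a minimal sketch of randomized row sampling. It assumes the FIM has the form $\sum_i J_i J_i^\top$, with $J_i$ the sensitivity (Jacobian row) of candidate experiment $i$, and it uses squared-row-norm sampling as a stand-in for the paper's sensitivity-informed distribution. The ensemble sampler mentioned in the abstract is not reproduced, and all names and the random test data are hypothetical.

```python
import random

def outer_add(matrix, row, weight):
    """Add weight * row row^T to matrix in place."""
    for a, ra in enumerate(row):
        for b, rb in enumerate(row):
            matrix[a][b] += weight * ra * rb

def full_fim(jacobian):
    """FIM = sum_i J_i J_i^T over all candidate experiments."""
    p = len(jacobian[0])
    fim = [[0.0] * p for _ in range(p)]
    for row in jacobian:
        outer_add(fim, row, 1.0)
    return fim

def sketched_fim(jacobian, num_samples, rng):
    """Unbiased FIM estimate from rows drawn with probability
    proportional to their squared norm (assumed sampling rule)."""
    norms = [sum(v * v for v in row) for row in jacobian]
    total = sum(norms)
    probs = [n / total for n in norms]
    picks = rng.choices(range(len(jacobian)), weights=probs, k=num_samples)
    p = len(jacobian[0])
    fim = [[0.0] * p for _ in range(p)]
    for i in picks:
        # Importance weight 1 / (m * p_i) keeps the estimator unbiased.
        outer_add(fim, jacobian[i], 1.0 / (num_samples * probs[i]))
    return fim

def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))

rng = random.Random(0)
# Hypothetical sensitivities: 200 candidate experiments, 3 parameters.
jacobian = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
exact = full_fim(jacobian)
approx = sketched_fim(jacobian, num_samples=40, rng=rng)
```

With this particular sampling rule every reweighted sample contributes the same amount to the trace, so the sketch reproduces the trace of the full FIM exactly while using only 40 of the 200 rows; the off-diagonal structure is matched only in expectation.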

CVSep 7, 2024
Deep Computer Vision for Solar Physics Big Data: Opportunities and Challenges

Bo Shen, Marco Marena, Chenyang Li et al.

With recent missions, including advanced space-based observatories such as the Solar Dynamics Observatory (SDO) and Parker Solar Probe and ground-based telescopes such as the Daniel K. Inouye Solar Telescope (DKIST), the volume, velocity, and variety of data have brought solar physics into a transformative era of solar physics big data (SPBD). With the recent advancement of deep computer vision, there are new opportunities in SPBD for tackling problems that were previously unsolvable. However, new challenges arise from the inherent characteristics of SPBD and deep computer vision models. This vision paper presents an overview of the different types of SPBD, explores new opportunities in applying deep computer vision to SPBD, highlights the unique challenges, and outlines several potential future research directions.

60.3ROMar 12
Safe and Stylized Trajectory Planning for Autonomous Driving via Diffusion Model

Shuo Pei, Yong Wang, Yuanchen Zhu et al.

Achieving safe and stylized trajectory planning in complex real-world scenarios remains a critical challenge for autonomous driving systems. This paper proposes the SDD Planner, a diffusion-based framework designed to effectively reconcile safety constraints with driving styles in real time. The framework integrates two core modules: a Multi-Source Style-Aware Encoder, which employs distance-sensitive attention to fuse dynamic agent data and environmental contexts for heterogeneous safety-style perception; and a Style-Guided Dynamic Trajectory Generator, which adaptively modulates priority weights within the diffusion denoising process to generate user-preferred yet safe trajectories. Extensive experiments demonstrate that SDD Planner achieves state-of-the-art performance. On the StyleDrive benchmark, it improves the SM-PDMS metric by 3.9% over WoTE, the strongest baseline. Furthermore, on the NuPlan Test14 and Test14-hard benchmarks, SDD Planner ranks first with overall scores of 91.76 and 80.32, respectively, outperforming leading methods such as PLUTO. Real-vehicle closed-loop tests further confirm that SDD Planner maintains high safety standards while aligning with preset driving styles, validating its practical applicability for real-world deployment.

AINov 21, 2023
IEKM: A Model Incorporating External Keyword Matrices

Cheng Luo, Qin Li, Zhao Yan et al.

A customer service platform system whose core task is semantic textual similarity (STS) faces two urgent challenges. First, a single platform system needs to adapt to customers from different domains, i.e., different domains adaptation (DDA). Second, it is difficult for the platform's model to distinguish sentence pairs that are literally close but semantically different, i.e., hard negative samples. In this paper, we propose a model incorporating external keyword matrices (IEKM) to address these challenges. The model uses external tools or dictionaries to construct external matrices and fuses them into the self-attention layers of the Transformer structure through gating units, thus enabling flexible corrections to the model results. We evaluate the method on multiple datasets, and the results show that it improves performance on all of them. To demonstrate that our method can effectively address both challenges, we conduct a flexible correction experiment, which results in an increase in the F1 value from 56.61 to 73.53. Our code will be publicly available.

78.2NAApr 14
What metric to optimize for suppressing instability in a Vlasov-Poisson system?

Martin Guerra, Qin Li, Yukun Yue et al.

Stabilizing plasma dynamics is a central challenge in magnetic confinement fusion. A common approach is to introduce external electric fields to suppress instabilities in the plasma distribution. However, efficiently identifying such stabilizing fields remains challenging, even for simplified kinetic models such as the Vlasov-Poisson (VP) system. In this work we study plasma stabilization from the perspective of PDE-constrained optimization. Our goal is to understand how the choice of objective function and the underlying kinetic dynamics influence the optimization landscape. First, we analyze the dispersion relation of the VP system and show that it reveals the spectral structure of the dynamics; eliminating unstable modes provides parameter configurations that lie close to the global optimum and serve as effective initial guesses for optimization. Second, we investigate several objective functions for stabilization and compare their optimization landscapes through numerical experiments. Our results show that while different objectives lead to similar stabilizing parameter configurations, objective functions incorporating time-integrated information exhibit more convex-like landscapes and are therefore more favorable for gradient-based optimization methods. These findings provide insight into the design of objective functions for optimization-based plasma control and suggest promising directions for future research on real-time stabilization of kinetic plasma models.

CLOct 15, 2025Code
D-SMART: Enhancing LLM Dialogue Consistency via Dynamic Structured Memory And Reasoning Tree

Xiang Lei, Qin Li, Min Zhang et al.

Large Language Models (LLMs) often exhibit factual inconsistencies and logical decay in extended, multi-turn dialogues, a challenge stemming from their reliance on static, pre-trained knowledge and an inability to reason adaptively over the dialogue history. Prevailing mitigation strategies, such as Retrieval-Augmented Generation (RAG) and agentic working memories, improve information recall but still engage with fundamentally static knowledge sources and follow a single pre-defined reasoning path. This hinders their ability to preserve the factual and logical consistency of their responses in multi-turn dialogues as the context evolves over time. To address this issue, we propose D-SMART, a model-agnostic framework designed to maintain multi-turn dialogue consistency by enabling LLMs to build and reason over a dynamic, structured representation of the conversational context. This is achieved via two synergistic components: (1) a Dynamic Structured Memory (DSM), which incrementally constructs and maintains an authoritative, OWL-compliant knowledge graph of the conversation; and (2) a Reasoning Tree (RT), which executes inferences as an explicit and traceable multi-step search over the graph. As the commonly used quality score (judged by GPT-4) can overlook logical flaws, we introduce new NLI-based metrics to better measure multi-turn dialogue consistency. Comprehensive experiments on the MT-Bench-101 benchmark show that D-SMART significantly outperforms state-of-the-art baselines, elevating the dialogue consistency score by over 48% for both proprietary and open-source models, and notably improves the quality score of the latter by up to 10.1%.

LGOct 6, 2025Code
Physics-informed Attention-enhanced Fourier Neural Operator for Solar Magnetic Field Extrapolations

Jinghao Cao, Qin Li, Mengnan Du et al.

We propose Physics-informed Attention-enhanced Fourier Neural Operator (PIANO) to solve the Nonlinear Force-Free Field (NLFFF) problem in solar physics. Unlike conventional approaches that rely on iterative numerical methods, our proposed PIANO directly learns the 3D magnetic field structure from 2D boundary conditions. Specifically, PIANO integrates Efficient Channel Attention (ECA) mechanisms with Dilated Convolutions (DC), which enhances the model's ability to capture multimodal input by prioritizing critical channels relevant to the magnetic field's variations. Furthermore, we apply a physics-informed loss by enforcing the force-free and divergence-free conditions in the training process so that our prediction is consistent with the underlying physics with high accuracy. Experimental results on the ISEE NLFFF dataset show that PIANO not only outperforms state-of-the-art neural operators in terms of accuracy but also shows strong consistency with the physical characteristics of NLFFF data across magnetic fields reconstructed from various solar active regions. The code for this project is available at https://github.com/Autumnstar-cjh/PIANO
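
The two physical constraints named above have a standard form, and a physics-informed loss of the kind described might be assembled as sketched below. The force-free and divergence-free conditions are the textbook ones; the squared-norm penalties, the weights $\lambda_{\mathrm{ff}}$ and $\lambda_{\mathrm{div}}$, and the data term $\mathcal{L}_{\mathrm{data}}$ are assumptions for illustration and are not taken from the paper.

```latex
% Force-free and divergence-free conditions for the magnetic field B
(\nabla \times \mathbf{B}) \times \mathbf{B} = \mathbf{0},
\qquad
\nabla \cdot \mathbf{B} = 0 .
% One possible penalized training objective (hypothetical weights)
\mathcal{L}
  = \mathcal{L}_{\mathrm{data}}
  + \lambda_{\mathrm{ff}} \,\bigl\| (\nabla \times \mathbf{B}_\theta) \times \mathbf{B}_\theta \bigr\|_2^2
  + \lambda_{\mathrm{div}} \,\bigl\| \nabla \cdot \mathbf{B}_\theta \bigr\|_2^2 .
```

Here $\mathbf{B}_\theta$ denotes the network prediction, and both penalty terms vanish exactly when the predicted field satisfies the NLFFF equations.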

CVOct 2, 2025Code
Consistent Assistant Domains Transformer for Source-free Domain Adaptation

Renrong Shao, Wei Zhang, Kangyang Luo et al.

Source-free domain adaptation (SFDA) aims to address the challenge of adapting to a target domain without accessing the source domain directly. However, due to the inaccessibility of source domain data, deterministic invariable features cannot be obtained. Current mainstream methods primarily focus on evaluating invariant features in the target domain that closely resemble those in the source domain, subsequently aligning the target domain with the source domain. However, these methods are susceptible to hard samples and influenced by domain bias. In this paper, we propose a Consistent Assistant Domains Transformer for SFDA, abbreviated as CADTrans, which solves the issue by constructing invariable feature representations of domain consistency. Concretely, we develop an assistant domain module for CADTrans to obtain diversified representations from the intermediate aggregated global attentions, which addresses the limitation of existing methods in adequately representing diversity. Based on assistant and target domains, invariable feature representations are obtained by multiple consistent strategies, which can be used to distinguish easy and hard samples. Finally, to align the hard samples to the corresponding easy samples, we construct a conditional multi-kernel max mean discrepancy (CMK-MMD) strategy to distinguish between samples of the same category and those of different categories. Extensive experiments are conducted on various benchmarks such as Office-31, Office-Home, VISDA-C, and DomainNet-126, proving the significant performance improvements achieved by our proposed approaches. Code is available at https://github.com/RoryShao/CADTrans.git.
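
The discrepancy measure underlying the alignment step can be illustrated with a plain multi-kernel maximum mean discrepancy. This is a minimal sketch of the unconditional, biased estimator with Gaussian kernels; the class-conditional weighting that makes it CMK-MMD in the paper is not reproduced, and the bandwidths, function names, and toy feature vectors are assumptions.

```python
import math

def gaussian_kernel(x, y, bandwidth):
    # k(x, y) = exp(-||x - y||^2 / (2 h^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def multi_kernel(x, y, bandwidths):
    # Average of Gaussian kernels over several bandwidths.
    return sum(gaussian_kernel(x, y, h) for h in bandwidths) / len(bandwidths)

def mean_kernel(xs, ys, bandwidths):
    # Mean kernel value over all pairs (x, y).
    total = sum(multi_kernel(x, y, bandwidths) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(xs, ys, bandwidths=(0.5, 1.0, 2.0)):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidths)
            + mean_kernel(ys, ys, bandwidths)
            - 2.0 * mean_kernel(xs, ys, bandwidths))

# Hypothetical 2-D features for two groups of samples.
easy = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0)]
hard = [(2.0, 2.1), (1.8, 2.2), (2.1, 1.9)]
gap = mmd_squared(easy, hard)
```

A value near zero indicates that the two feature sets are distributed alike, so minimizing this quantity over the feature extractor pulls one group toward the other.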

26.5LGMay 6
Provable imitation learning for control of instability in partially-observed Vlasov--Poisson equations

Xiaofan Xia, Qin Li, Wenlong Mou

We consider the stabilization of Vlasov--Poisson plasma dynamics, a central control problem in nuclear fusion. Our focus is the gap between what an ideal controller would use and what experiments can actually observe: while an optimal policy may rely on the full phase-space state, practical feedback is typically limited to sparse macroscopic diagnostics. We therefore study imitation learning methods that distill a fully observed expert policy into controllers operating only on macroscopic measurements. We show the stability guarantees of the learned policy, where the error floor depends on the minimal behavior cloning loss achievable under the observation constraints. We further characterize this minimal loss in terms of a notion of entropy that quantifies the complexity of the initial distribution. Our results demonstrate the theoretical feasibility of learning stabilizing feedback policies for kinetic plasma dynamics from macroscopic observations, and exhibit the adaptivity of the learning approach to low-complexity structures. Through extensive numerical experiments, we validate our theory and show that the learned policies can stabilize the system using only macroscopic observations, within a significantly longer time horizon than non-adaptive baseline controllers.

21.6NAMar 15
Inference of interacting kernel in the mean-field regime

Peiyi Chen, Qin Li, Li Wang et al.

We study the problem of reconstructing interaction kernels in systems of interacting agents from macroscopic measurements when posed as an optimization problem. The reconstruction procedure depends on the formulation of the forward model, which may be given either by a finite-dimensional coupled ODE system tracking individual agent trajectories or by a mean-field PDE describing the evolution of the agent density. We investigate the similarities and differences between these two formulations in the mean-field regime. While the first variation derived from the particle system does not provide an unbiased estimator of the first variation associated with the limiting PDE, we prove that, under mild assumptions, the two are close in a weak sense with a convergence rate $\mathcal{O}(N^{-1/2})$. This rate is further confirmed by numerical evidence.

LGJan 10, 2024
A Good Score Does not Lead to A Good Generative Model

Sixu Li, Shi Chen, Qin Li

Score-based Generative Models (SGMs) are a leading method in generative modeling, renowned for their ability to generate high-quality samples from complex, high-dimensional data distributions. The method enjoys empirical success and is supported by rigorous theoretical convergence properties. In particular, it has been shown that SGMs can generate samples from a distribution that is close to the ground truth if the underlying score function is learned well, suggesting the success of SGMs as generative models. We provide a counter-example in this paper. Through a sample complexity argument, we provide one specific setting where the score function is learned well. Yet, SGMs in this setting can only output samples that are Gaussian blurrings of training data points, mimicking the effects of kernel density estimation. The finding resonates with a series of recent findings revealing that SGMs can demonstrate a strong memorization effect and fail to generate.
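
To make the kernel density estimation comparison concrete, the score of a Gaussian-smoothed empirical distribution can be written in closed form. This is a standard identity offered as a sketch; the notation (training points $y_1,\dots,y_N$, bandwidth $\sigma$, weights $w_i$) is an assumption introduced here rather than the paper's own.

```latex
% Gaussian kernel density estimate built from training points y_1, ..., y_N
p_\sigma(x) = \frac{1}{N}\sum_{i=1}^{N} \mathcal{N}\!\left(x;\, y_i,\, \sigma^2 I\right),
% and its score, a softmax-weighted pull toward the training points
\nabla_x \log p_\sigma(x)
  = \frac{1}{\sigma^2}\sum_{i=1}^{N} w_i(x)\,\bigl(y_i - x\bigr),
\qquad
w_i(x) = \frac{\exp\!\left(-\|x-y_i\|^2 / 2\sigma^2\right)}
              {\sum_{j=1}^{N}\exp\!\left(-\|x-y_j\|^2 / 2\sigma^2\right)} .
```

A sampler driven by a score of this form targets $p_\sigma$ itself, so its outputs are training points perturbed by Gaussian noise, which is the memorization behavior the abstract describes.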

92.6CVMar 25
Mitigating Object Hallucinations in LVLMs via Attention Imbalance Rectification

Han Sun, Qin Li, Peixin Wang et al.

Object hallucination in Large Vision-Language Models (LVLMs) severely compromises their reliability in real-world applications, posing a critical barrier to their deployment in high-stakes scenarios such as autonomous driving and medical image analysis. Through systematic empirical investigation, we identify that the imbalanced attention allocation, both across modalities (i.e., vision and language) and within modalities (among individual tokens), exhibits a strong causal correlation with the occurrence of object hallucination. Leveraging this insight, we introduce a novel concept termed attention imbalance, which not only quantifies the degree of attention disparity but also visually delineates the underlying patterns (e.g., over-attentiveness to irrelevant language tokens or under-attentiveness to discriminative visual features) that drive object hallucination. To mitigate object hallucination, we further propose Attention Imbalance Rectification (AIR), a lightweight decoding-time intervention method that reallocates attention weights and adjusts attention distributions to rectify modality-wise and token-wise imbalances. Extensive evaluations on four mainstream LVLMs and three benchmarks (CHAIR, POPE, and MM-Vet) with seven baselines demonstrate that AIR consistently reduces object hallucination rates, achieving up to a 35.1% reduction compared to the baselines, while improving up to 15.9% of LVLMs' general capability across diverse vision-language tasks.

57.4PLASM-PHMar 17
Control of a Uniformly Magnetized Plasma with External Electric Fields

Peiyi Chen, Rogerio Jorge, Qin Li et al.

Stabilizing plasma dynamics through externally applied electric and magnetic fields is a fundamental control problem. We study this question for a plasma evolving under a uniform external magnetic field. Although the governing dynamics are nonlinear, a linear analysis based on the Laplace-Fourier transform yields actionable insight. In particular, by controlling the location of the roots of the dispersion relation, we propose a general control strategy that restores stability, with the free-streaming solution recovered as a special case. Numerical experiments for Gaussian equilibria and for the Dory-Guest-Harris instability show that the proposed control suppresses the unstable modes and stabilizes the dynamics, in agreement with our theoretical predictions.

CVJul 23, 2025
Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention

Yiwen Chen, Zhihao Li, Yikai Wang et al.

Recent advances in sparse voxel representations have significantly improved the quality of 3D content generation, enabling high-resolution modeling with fine-grained geometry. However, existing frameworks suffer from severe computational inefficiencies due to the quadratic complexity of attention mechanisms in their two-stage diffusion pipelines. In this work, we propose Ultra3D, an efficient 3D generation framework that significantly accelerates sparse voxel modeling without compromising quality. Our method leverages the compact VecSet representation to efficiently generate a coarse object layout in the first stage, reducing token count and accelerating voxel coordinate prediction. To refine per-voxel latent features in the second stage, we introduce Part Attention, a geometry-aware localized attention mechanism that restricts attention computation within semantically consistent part regions. This design preserves structural continuity while avoiding unnecessary global attention, achieving up to 6.7x speed-up in latent generation. To support this mechanism, we construct a scalable part annotation pipeline that converts raw meshes into part-labeled sparse voxels. Extensive experiments demonstrate that Ultra3D supports high-resolution 3D generation at 1024 resolution and achieves state-of-the-art performance in both visual fidelity and user preference.
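The idea of restricting attention to semantically consistent part regions can be sketched as a masked attention over tokens that share a part label. This is a minimal illustration and not Ultra3D's implementation: the function name `part_attention`, the single-head layout, and the integer `part_ids` array are all assumptions. The dense mask used here shows only the semantics; it does not reproduce the reported speed-up, which would require computing attention per part rather than masking a full score matrix.

```python
import numpy as np

def part_attention(q, k, v, part_ids):
    """Single-head attention where each token attends only to tokens in its own part.

    q, k, v: arrays of shape (n_tokens, d); part_ids: integer array of shape (n_tokens,).
    All names here are hypothetical and chosen for illustration.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                       # (n, n) scaled dot products
    same_part = part_ids[:, None] == part_ids[None, :]  # True where labels match
    scores = np.where(same_part, scores, -np.inf)       # block cross-part attention
    scores = scores - scores.max(axis=1, keepdims=True) # each row keeps its own token, so max is finite
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v
```

A quick way to check the behavior is to perturb the values of one part and confirm that the outputs of tokens in every other part are unchanged.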