Jiwei Zhang

CV
h-index: 4
Papers: 20
Citations: 55
Novelty: 48%
AI Score: 53

20 Papers

AI · May 28
MINDGAMES: A Live Arena for Evaluating Social and Strategic Reasoning in Multi-Agent LLMs

Kevin Wang, Anna Thöni, Benjamin Kempinski et al.

Large language models (LLMs) are increasingly deployed as interactive agents, yet their capacity for social and strategic reasoning over extended interaction remains poorly understood. Existing evaluations rely on static vignettes or single-game benchmarks that cannot capture the sustained, multi-faceted reasoning that real-world multi-agent settings demand. We introduce Mindgames, a multi-game arena and evaluation platform for LLM agents that operationalizes complementary reasoning demands relevant to "theory of mind": belief attribution under hidden information, opponent modeling through repeated strategic interaction, cooperative inference under knowledge asymmetries, and sustained deception in social deduction. Built on TextArena, Mindgames provides a unified interaction interface, TrueSkill-based rating, and full trajectory logging across four game environments. We instantiate Mindgames through a 2025 competition cycle hosted at a major AI conference, which assessed 944 submitted agents from 76 teams across four games: Colonel Blotto, Iterated Prisoner's Dilemma, Codenames, and Secret Mafia. Our analysis surfaces both agent-level and evaluation-level limitations: brittle rule adherence remains a major bottleneck, top-performing systems repeatedly rely on explicit structural scaffolding, and leaderboard validity differs sharply across environments. In particular, failure-heavy environments can reward robustness to opponent errors as much as strategic ability, with Secret Mafia exhibiting a pronounced error-survival confound in this cycle. We release a dataset of 29,571 multi-agent games with turn-level observations, actions, and rewards, together with MG-Ref, a deterministic offline tournament protocol that scores new agents against a frozen reference pool of top-ranked, low-error Stage II submissions under the same error-attribution lens used in this analysis.

NA · Nov 11, 2015
Fast evaluation of the Caputo fractional derivative and its applications to fractional diffusion equations

Shidong Jiang, Jiwei Zhang, Qian Zhang et al.

We present an efficient algorithm for the evaluation of the Caputo fractional derivative $_0^C\!D_t^αf(t)$ of order $α\in (0,1)$, which can be expressed as a convolution of $f'(t)$ with the kernel $t^{-α}$. The algorithm is based on an efficient sum-of-exponentials approximation for the kernel $t^{-1-α}$ on the interval $[Δt, T]$ with a uniform absolute error $\varepsilon$, where the number of exponentials $N_{\text{exp}}$ needed is of the order $O\left(\log\frac{1}{\varepsilon}\left( \log\log\frac{1}{\varepsilon}+\log\frac{T}{Δt}\right) +\log\frac{1}{Δt}\left( \log\log\frac{1}{\varepsilon}+\log\frac{1}{Δt}\right) \right)$. As compared with the direct method, the resulting algorithm reduces the storage requirement from $O(N_T)$ to $O(N_{\text{exp}})$ and the overall computational cost from $O(N_T^2)$ to $O(N_TN_{\text{exp}})$ with $N_T$ the total number of time steps. Furthermore, when the fast evaluation scheme of the Caputo derivative is applied to solve the fractional diffusion equations, the resulting algorithm requires only $O(N_SN_{\text{exp}})$ storage and $O(N_SN_TN_{\text{exp}})$ work with $N_S$ the total number of points in space, whereas the direct methods require $O(N_SN_T)$ storage and $O(N_SN_T^2)$ work. The complexity of both algorithms is nearly optimal since $N_{\text{exp}}$ is of the order $O(\log N_T)$ for $T\gg 1$ or $O(\log^2N_T)$ for $T\approx 1$ for fixed accuracy $\varepsilon$. We also present a detailed stability and error analysis of the new scheme for solving linear fractional diffusion equations. The performance of the new algorithm is illustrated via several numerical examples. Finally, the algorithm can be parallelized in a straightforward manner.
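
As a rough illustration of the sum-of-exponentials idea, the sketch below approximates $t^{-β}$ with $β = 1+α$ by applying the trapezoidal rule to the integral representation $t^{-β} = \frac{1}{Γ(β)}\int_{-\infty}^{\infty} e^{-t e^{x} + βx}\,dx$. This is a simpler, swapped-in construction and not the quadrature described in the paper. The function names `soe_power_kernel` and `eval_soe`, the truncation window `[x_min, x_max]`, and the step `h` are assumptions chosen for illustration only.

```python
import math


def soe_power_kernel(beta, x_min, x_max, h):
    """Return nodes s_j and weights w_j with t**(-beta) ~ sum_j w_j * exp(-s_j * t).

    Hypothetical helper: trapezoidal rule in x for the integral
    t**(-beta) = 1/Gamma(beta) * int exp(-t*e^x + beta*x) dx, with s = e^x.
    The window [x_min, x_max] must be wide enough that the integrand is
    negligible at both ends for every t of interest, so the endpoint
    half-weights of the trapezoidal rule are ignored here.
    """
    n = int(round((x_max - x_min) / h))
    nodes, weights = [], []
    for j in range(n + 1):
        x = x_min + j * h
        nodes.append(math.exp(x))                                # s_j = e^{x_j}
        weights.append(h * math.exp(beta * x) / math.gamma(beta))
    return nodes, weights


def eval_soe(nodes, weights, t):
    """Evaluate the sum-of-exponentials approximation at time t > 0."""
    return sum(w * math.exp(-s * t) for s, w in zip(nodes, weights))


if __name__ == "__main__":
    alpha = 0.5                      # fractional order (assumed value)
    beta = 1.0 + alpha               # kernel t^{-1-alpha}
    nodes, weights = soe_power_kernel(beta, x_min=-30.0, x_max=12.0, h=0.25)
    for t in (1e-3, 1e-2, 0.1, 1.0):  # sample points in [dt, T] = [1e-3, 1]
        exact = t ** (-beta)
        rel_err = abs(eval_soe(nodes, weights, t) - exact) / exact
        print(f"t={t:g}  relative error={rel_err:.2e}")
```

Because every term is a decaying exponential in $t$, a convolution with this kernel can be advanced by a short recurrence per term instead of re-summing the full history, which is the source of the storage and cost savings quoted in the abstract. The number of terms in this sketch is not optimized.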

NA · Nov 22, 2018
A discrete Grönwall inequality with application to numerical schemes for subdiffusion problems

Hong-lin Liao, William McLean, Jiwei Zhang

We consider a class of numerical approximations to the Caputo fractional derivative. Our assumptions permit the use of nonuniform time steps, such as is appropriate for accurately resolving the behavior of a solution whose derivatives are singular at $t=0$. The main result is a type of fractional Grönwall inequality and we illustrate its use by outlining some stability and convergence estimates of schemes for fractional reaction-subdiffusion problems. This approach extends earlier work that used the familiar L1 approximation to the Caputo fractional derivative, and will facilitate the analysis of higher order and linearized fast schemes.

NA · Dec 2, 2016
Analysis of $L1$-Galerkin FEMs for time-fractional nonlinear parabolic problems

Dongfang Li, Hong-lin Liao, Weiwei Sun et al.

This paper is concerned with numerical solutions of time-fractional nonlinear parabolic problems by a class of $L1$-Galerkin finite element methods. The analysis of $L1$ methods for time-fractional nonlinear problems is limited mainly due to the lack of a fundamental Gronwall type inequality. In this paper, we establish such a fundamental inequality for the $L1$ approximation to the Caputo fractional derivative. In terms of the Gronwall type inequality, we provide optimal error estimates of several fully discrete linearized Galerkin finite element methods for nonlinear problems. The theoretical results are illustrated by applying our proposed methods to three examples: linear Fokker-Planck equation, nonlinear Huxley equation and Fisher equation.

NA · Nov 20, 2018
Sharp $H^1$-norm error estimates of two time-stepping schemes for reaction-subdiffusion problems

Jincheng Ren, Hong-lin Liao, Jiwei Zhang et al.

Due to the intrinsic initial singularity of the solution and the discrete convolution form of numerical Caputo derivatives, the traditional $H^1$-norm analysis (corresponding to the case of a classical diffusion equation) of time approximations of a fractional subdiffusion problem always leads to suboptimal error estimates (a loss of time accuracy). To recover the theoretical accuracy in time, we propose an improved discrete Grönwall inequality and apply it to the well-known L1 formula and a fractional Crank-Nicolson scheme. With the help of a time-space error-splitting technique and the global consistency analysis, sharp $H^1$-norm error estimates of the two nonuniform approaches are established for a reaction-subdiffusion problem. Numerical experiments are included to confirm the sharpness of our analysis.

NA · Apr 6, 2018
Unconditional convergence of a fast two-level linearized algorithm for semilinear subdiffusion equations

Hong-lin Liao, Yonggui Yan, Jiwei Zhang

A fast two-level linearized scheme with unequal time-steps is constructed and analyzed for an initial-boundary-value problem of semilinear subdiffusion equations. The two-level fast L1 formula of the Caputo derivative is derived based on the sum-of-exponentials technique. The resulting fast algorithm is computationally efficient in long-time simulations because it significantly reduces the computational cost $O(MN^2)$ and storage $O(MN)$ for the standard L1 formula to $O(MN\log N)$ and $O(M\log N)$, respectively, for $M$ grid points in space and $N$ levels in time. The nonuniform time mesh can be graded to handle the typical singularity of the solution near the time $t=0$, and Newton linearization is used to approximate the nonlinear term. Our analysis relies on three tools: a new discrete fractional Grönwall inequality, a global consistency analysis and a discrete $H^2$ energy method. A sharp error estimate reflecting the regularity of the solution is established without any restriction on the relative diameters of the temporal and spatial mesh sizes. Numerical examples are provided to demonstrate the effectiveness of our approach and the sharpness of the error analysis.
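
For orientation, the standard (non-fast) L1 formula on a graded mesh can be sketched as follows. It is a minimal illustration of the $O(N^2)$ baseline that the abstract compares against, not the two-level fast scheme itself. The helper names `graded_mesh` and `l1_caputo`, the mesh form $t_k = T(k/N)^{γ}$, and the parameter values are assumptions made for this example.

```python
import math


def graded_mesh(T, N, gamma):
    """Graded time mesh t_k = T * (k/N)**gamma, clustered near t = 0 for gamma > 1."""
    return [T * (k / N) ** gamma for k in range(N + 1)]


def l1_caputo(f_vals, t, alpha):
    """Standard L1 approximation of the Caputo derivative of order alpha in (0, 1).

    f is replaced by its piecewise-linear interpolant on the mesh t, and the
    kernel (t_n - s)**(-alpha) / Gamma(1 - alpha) is integrated exactly on
    each subinterval. Entry n of the result approximates the derivative at
    t[n]; entry 0 is a placeholder because the formula starts at n = 1.
    The double loop makes the O(N^2) cost of the direct method explicit.
    """
    c = 1.0 / math.gamma(2.0 - alpha)
    out = [0.0]
    for n in range(1, len(t)):
        total = 0.0
        for k in range(1, n + 1):
            tau_k = t[k] - t[k - 1]
            # Exact integral of the kernel over [t_{k-1}, t_k], divided by tau_k.
            a_nk = c * ((t[n] - t[k - 1]) ** (1.0 - alpha)
                        - (t[n] - t[k]) ** (1.0 - alpha)) / tau_k
            total += a_nk * (f_vals[k] - f_vals[k - 1])
        out.append(total)
    return out


if __name__ == "__main__":
    alpha, T, N, gamma = 0.5, 1.0, 16, 2.0   # illustrative values only
    t = graded_mesh(T, N, gamma)
    approx = l1_caputo(t, t, alpha)           # f(t) = t, so f_vals equals t
    exact = [tn ** (1.0 - alpha) / math.gamma(2.0 - alpha) for tn in t]
    print(max(abs(a - e) for a, e in zip(approx[1:], exact[1:])))
```

The check uses $f(t)=t$, whose Caputo derivative is $t^{1-α}/Γ(2-α)$. The L1 formula reproduces it up to rounding on any mesh because the interpolant of a linear function is exact, so this only confirms the weights and says nothing about convergence rates for singular solutions.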

NA · Mar 6, 2018
An Accurate and Efficient Algorithm for The Time-fractional Molecular Beam Epitaxy Model with Slope Selection

Lizhen Chen, Jia Zhao, Waixiang Cao et al.

In this paper, we propose a time-fractional molecular beam epitaxy (MBE) model with slope selection, together with an efficient, accurate, fully discrete, linear numerical approximation. The numerical scheme uses the fast algorithm for the Caputo fractional derivative operator in the time discretization and a Fourier spectral method in the spatial discretization. Refinement tests are conducted to verify the $2-α$ order of time convergence, with $α\in (0, 1]$ the fractional order of the derivative. Several numerical simulations are presented to demonstrate the accuracy and efficiency of the proposed scheme. By exploiting the fast algorithm for the Caputo fractional derivative, our numerical scheme makes long-time simulation of MBE coarsening practical, which is essential for the MBE model in practice. With the proposed fractional MBE model, we observe that the energy decays as $O(t^{-\frac{α}{3}})$ and the roughness increases as $O(t^{\frac{α}{3}})$ during the coarsening dynamics with random initial conditions. That is to say, the coarsening rate of the MBE model can be manipulated by the fractional order $α$, and it is linearly proportional to $α$. This is the first time such a scaling correlation has been reported in the literature. It provides a potential application field for fractional differential equations. Besides, the numerical approximation strategy proposed in this paper can be readily applied to study many classes of time-fractional and high-dimensional phase field models.

NA · Jun 11, 2008
A unified approach to split absorbing boundary conditions for nonlinear Schrödinger equations

Jiwei Zhang, Zhenli Xu, Xiaonan Wu

An efficient method is proposed for numerical solutions of nonlinear Schrödinger equations in an unbounded domain. Through approximating the kinetic energy term by a one-way equation and uniting it with the potential energy equation, absorbing boundary conditions are designed to truncate the unbounded domain, which are in nonlinear form and can perfectly absorb the waves outgoing from the truncated domain. We examine the stability of the induced initial boundary value problems defined on the computational domain with the boundary conditions by a normal mode analysis. Numerical examples are given to illustrate the stable and tractable advantages of the method.

NA · Mar 21, 2018
Superconvergence Points of Integer and Fractional Derivatives of Special Hermite Interpolations and Its Applications in Solving FDEs

Beichuan Deng, Jiwei Zhang, Zhimin Zhang

In this paper, we study convergence and superconvergence theory of integer and fractional derivatives of the one-point and the two-point Hermite interpolations. When considering the integer-order derivative, exponential decay of the error is proved, and superconvergence points are located, at which the convergence rates are $O(N^{-2})$ and $O(N^{-1.5})$, respectively, better than the global rate for the one-point and two-point interpolations. Here $N$ represents the degree of interpolation polynomial. It is proved that the $α$-th fractional derivative of $(u-u_N)$ with $k<α<k+1$, is bounded by its $(k+1)$-th derivative. Furthermore, the corresponding superconvergence points are predicted for fractional derivatives, and an eigenvalue method is proposed to calculate the superconvergence points for the Riemann-Liouville fractional derivative. In the application of the knowledge of superconvergence points to solve FDEs, we discover that a modified collocation method makes numerical solutions much more accurate than the traditional collocation method.

DS · Aug 25, 2022
Learning to Prune Instances of Steiner Tree Problem in Graphs

Jiwei Zhang, Deepak Ajwani

We consider the Steiner tree problem on graphs where we are given a set of nodes and the goal is to find a tree sub-graph of minimum weight that contains all nodes in the given set, potentially including additional nodes. This is a classical NP-hard combinatorial optimisation problem. In recent years, a machine learning framework called learning-to-prune has been successfully used for solving a diverse range of combinatorial optimisation problems. In this paper, we use this learning framework on the Steiner tree problem and show that even on this problem, the learning-to-prune framework results in computing near-optimal solutions at a fraction of the time required by commercial ILP solvers. Our results underscore the potential of the learning-to-prune framework in solving various combinatorial optimisation problems.

CV · Nov 16, 2025
D$^{2}$-VPR: A Parameter-efficient Visual-foundation-model-based Visual Place Recognition Method via Knowledge Distillation and Deformable Aggregation

Zheyuan Zhang, Jiwei Zhang, Boyu Zhou et al.

Visual Place Recognition (VPR) aims to determine the geographic location of a query image by retrieving its most visually similar counterpart from a geo-tagged reference database. Recently, the emergence of the powerful visual foundation model, DINOv2, trained in a self-supervised manner on massive datasets, has significantly improved VPR performance. This improvement stems from DINOv2's exceptional feature generalization capabilities but is often accompanied by increased model complexity and computational overhead that impede deployment on resource-constrained devices. To address this challenge, we propose $D^{2}$-VPR, a $D$istillation- and $D$eformable-based framework that retains the strong feature extraction capabilities of visual foundation models while significantly reducing model parameters and achieving a more favorable performance-efficiency trade-off. Specifically, first, we employ a two-stage training strategy that integrates knowledge distillation and fine-tuning. Additionally, we introduce a Distillation Recovery Module (DRM) to better align the feature spaces between the teacher and student models, thereby minimizing knowledge transfer losses to the greatest extent possible. Second, we design a Top-Down-attention-based Deformable Aggregator (TDDA) that leverages global semantic features to dynamically and adaptively adjust the Regions of Interest (ROI) used for aggregation, thereby improving adaptability to irregular structures. Extensive experiments demonstrate that our method achieves competitive performance compared to state-of-the-art approaches. Meanwhile, it reduces the parameter count by approximately 64.2% and FLOPs by about 62.6% (compared to CricaVPR). Code is available at https://github.com/tony19980810/D2VPR.

CV · Jan 6, 2025
InpDiffusion: Image Inpainting Localization via Conditional Diffusion Models

Kai Wang, Shaozhang Niu, Qixian Hao et al.

As artificial intelligence advances rapidly, particularly with the advent of GANs and diffusion models, accurate Image Inpainting Localization (IIL) has become increasingly challenging. Current IIL methods face two main challenges: a tendency towards overconfidence, leading to incorrect predictions; and difficulty in detecting subtle tampering boundaries in inpainted images. In response, we propose a new paradigm that treats IIL as a conditional mask generation task utilizing diffusion models. Our method, InpDiffusion, utilizes the denoising process enhanced by the integration of image semantic conditions to progressively refine predictions. During denoising, we employ edge conditions and introduce a novel edge supervision strategy to enhance the model's perception of edge details in inpainted objects. Balancing the diffusion model's stochastic sampling with edge supervision of tampered image regions mitigates the risk of incorrect predictions from overconfidence and prevents the loss of subtle boundaries that can result from overly stochastic processes. Furthermore, we propose an innovative Dual-stream Multi-scale Feature Extractor (DMFE) for extracting multi-scale features, enhancing feature representation by considering both semantic and edge conditions of the inpainted images. Extensive experiments across challenging datasets demonstrate that InpDiffusion significantly outperforms existing state-of-the-art methods in IIL tasks, while also showcasing excellent generalization capabilities and robustness.

NA · Apr 8
A Locking-free and Loosely Coupled Robin-Robin Scheme for Fluid-Poroelasticity Interaction

Wenlong He, Thomas Wick, Xiaohe Yue et al.

We study a fluid-poroelasticity interaction (FPSI) problem coupling the unsteady Stokes equations with the fully dynamic Biot system. A major challenge in such problems is to design partitioned schemes that remain robust in locking-related parameter regimes while preserving the physical interface coupling structure. To address this issue, we introduce two auxiliary variables and reformulate the Biot system as a four-field problem consisting of a dynamic Stokes-like system coupled with a diffusion equation. Crucially, this reformulation preserves the original interface conditions. Based on Robin-Robin transmission conditions with explicitly lagged interface data, we construct a fully decoupled scheme in which the fluid and poroelastic subproblems can be solved independently and in parallel at each time step, without sub-iterations. We prove that the resulting method is unconditionally stable and derive optimal-order error estimates in the $H^1$-norm. The analysis further shows that the scheme is robust with respect to extreme poroelastic parameters and avoids the locking effects inherent in standard formulations. Numerical experiments confirm the theoretical convergence results and demonstrate the locking-robust performance of the proposed method.

CL · Aug 28, 2025
Feel the Difference? A Comparative Analysis of Emotional Arcs in Real and LLM-Generated CBT Sessions

Xiaoyi Wang, Jiwei Zhang, Guangtao Zhang et al.

Synthetic therapy dialogues generated by large language models (LLMs) are increasingly used in mental health NLP to simulate counseling scenarios, train models, and supplement limited real-world data. However, it remains unclear whether these synthetic conversations capture the nuanced emotional dynamics of real therapy. In this work, we introduce RealCBT, a dataset of authentic cognitive behavioral therapy (CBT) dialogues, and conduct the first comparative analysis of emotional arcs between real and LLM-generated CBT sessions. We adapt the Utterance Emotion Dynamics framework to analyze fine-grained affective trajectories across valence, arousal, and dominance dimensions. Our analysis spans both full dialogues and individual speaker roles (counselor and client), using real sessions from the RealCBT dataset and synthetic dialogues from the CACTUS dataset. We find that while synthetic dialogues are fluent and structurally coherent, they diverge from real conversations in key emotional properties: real sessions exhibit greater emotional variability, more emotion-laden language, and more authentic patterns of reactivity and regulation. Moreover, emotional arc similarity remains low across all pairings, with especially weak alignment between real and synthetic speakers. These findings underscore the limitations of current LLM-generated therapy data and highlight the importance of emotional fidelity in mental health applications. To support future research, our dataset RealCBT is released at https://gitlab.com/xiaoyi.wang/realcbt-dataset.

IR · Dec 5, 2021
Variational Autoencoder with CCA for Audio-Visual Cross-Modal Retrieval

Jiwei Zhang, Yi Yu, Suhua Tang et al.

Cross-modal retrieval uses one modality as a query to retrieve data from another modality, and it has become a popular topic in information retrieval, machine learning, and databases. How to effectively measure the similarity between data of different modalities is the major challenge of cross-modal retrieval. Although several research works have calculated the correlation between data of different modalities by learning a common subspace representation, the encoder's ability to extract features from multi-modal information is not satisfactory. In this paper, we present a novel variational autoencoder (VAE) architecture for audio-visual cross-modal retrieval, which learns paired audio-visual correlation embedding and category correlation embedding as constraints to reinforce the mutuality of audio-visual information. On the one hand, an audio encoder and a visual encoder separately encode audio data and visual data into two different latent spaces. Further, two mutual latent spaces are respectively constructed by canonical correlation analysis (CCA). On the other hand, probabilistic modeling methods are used to deal with possible noise and missing information in the data. Additionally, in this way, the cross-modal discrepancies from intra-modal and inter-modal information are simultaneously eliminated in the joint embedding subspace. We conduct extensive experiments on two benchmark datasets. The experimental results show that the proposed architecture is effective in learning audio-visual correlation and is appreciably better than existing cross-modal retrieval methods.

RO · Jul 6, 2021
DL-AMP and DBTO: An Automatic Merge Planning and Trajectory Optimization and Its Application in Autonomous Driving

Yuncheng Jiang, Qi Lin, Jiwei Zhang et al.

This paper presents an automatic merging algorithm for autonomous driving vehicles, which decouples the specific motion planning problem into a Dual-Layer Automatic Merge Planning (DL-AMP) and a Descent-Based Trajectory Optimization (DBTO). This work leads to great improvements in finding the best merge opportunity, lateral and longitudinal merge planning and control, trajectory postprocessing and driving comfort.

LG · Jan 4, 2021
Frequency Principle in Deep Learning Beyond Gradient-descent-based Training

Yuheng Ma, Zhi-Qin John Xu, Jiwei Zhang

The frequency perspective has recently made progress in understanding deep learning. It has been widely verified in both empirical and theoretical studies that deep neural networks (DNNs) often fit the target function from low to high frequency, a phenomenon known as the Frequency Principle (F-Principle). The F-Principle sheds light on the strengths and weaknesses of DNNs and has inspired a series of subsequent works, including theoretical studies, empirical studies and the design of efficient DNN structures. Previous works examine the F-Principle in gradient-descent-based training. It remains unclear whether gradient-descent-based training is a necessary condition for the F-Principle. In this paper, we show that the F-Principle exists stably in the training process of DNNs with non-gradient-descent-based training, including optimization algorithms with gradient information, such as conjugate gradient and BFGS, and algorithms without gradient information, such as Powell's method and Particle Swarm Optimization. These empirical studies show the universality of the F-Principle and provide hints for further study of the F-Principle.

LG · Dec 6, 2019
A priori generalization error for two-layer ReLU neural network through minimum norm solution

Zhi-Qin John Xu, Jiwei Zhang, Yaoyu Zhang et al.

We focus on estimating the \emph{a priori} generalization error of two-layer ReLU neural networks (NNs) trained by mean squared error, which only depends on initial parameters and the target function, through the following research line. We first estimate the \emph{a priori} generalization error of a finite-width two-layer ReLU NN with the constraint of a minimal norm solution, which is proved by \cite{zhang2019type} to be an equivalent solution of a linearized (w.r.t. parameter) finite-width two-layer NN. As the width goes to infinity, the linearized NN converges to the NN in the Neural Tangent Kernel (NTK) regime \citep{jacot2018neural}. Thus, we can derive the \emph{a priori} generalization error of a two-layer ReLU NN in the NTK regime. The distance between an NN in the NTK regime and a finite-width NN with gradient training is estimated by \cite{arora2019exact}. Based on the results in \cite{arora2019exact}, our work proves an \emph{a priori} generalization error bound of two-layer ReLU NNs. This estimate uses the intrinsic implicit bias of the minimum norm solution without requiring extra regularity in the loss function. This \emph{a priori} estimate also implies that the NN does not suffer from the curse of dimensionality, and a small generalization error can be achieved without requiring an exponentially large number of neurons. In addition, the research line proposed in this paper can also be used to study other properties of the finite-width network, such as the posterior generalization error.

NA · Apr 28, 2019
A second-order scheme with nonuniform time steps for a linear reaction-subdiffusion problem

Hong-lin Liao, William McLean, Jiwei Zhang

Stability and convergence of a time-weighted discrete scheme with nonuniform time steps are established for linear reaction-subdiffusion equations. The Caputo derivative is approximated at an offset point by using linear and quadratic polynomial interpolation. Our analysis relies on two tools: a discrete fractional Grönwall inequality and the global consistency analysis. The new consistency analysis makes use of an interpolation error formula for quadratic polynomials, which leads to a convolution-type bound for the local truncation error. To exploit these two tools, some theoretical properties of the discrete kernels in the numerical Caputo formula are crucial and we investigate them intensively in the nonuniform setting. Taking the initial singularity of the solution into account, we obtain a sharp error estimate on nonuniform time meshes. The fully discrete scheme generates a second-order accurate solution on the graded mesh provided a proper grading parameter is employed. An example is presented to show the sharpness of our analysis.

COMP-PH · Oct 14, 2018
A Unified Gas-kinetic Particle Method for Multiscale Photon Transport

Weiming Li, Chang Liu, Yajun Zhu et al.

In this work, we present a unified gas-kinetic particle (UGKP) method for the simulation of multiscale photon transport. The multiscale nature of the particle method mainly comes from the recovery of the time evolution flux function in the unified gas-kinetic scheme (UGKS) through a coupled dynamic process of particle transport and collision. This practice improves the original operator splitting approach in the Monte Carlo method, such as the separated treatment of particle transport and collision. As a result, with the variation of the ratio between the numerical time step and the local photon's collision time, different transport physics can be fully captured in a single computation. In the diffusive limit, the UGKP method can recover the solution of the diffusion equation with the cell size and time step being much larger than the photon's mean free path and the mean collision time. In the free transport limit, it presents an exact particle tracking process, as in the original Monte Carlo method. In the transition regime, the weights of particle free transport and collision are determined by the ratio of the local numerical time step to the photon's collision time. Several one-dimensional numerical examples covering all transport regimes from the optically thin to optically thick are computed to validate the accuracy and efficiency of the current scheme. In comparison with the $S_N$ discrete ordinate method, the UGKP method is based on particles and avoids the discretization of particle velocity space, and therefore does not suffer from the ray effect.