NA Nov 7, 2016
Positivity-preserving and asymptotic preserving method for 2D Keller-Segal equations
Jian-Guo Liu, Li Wang, Zhennan Zhou
We propose a semi-discrete scheme for 2D Keller-Segel equations based on a symmetrization reformulation, which is equivalent to the convex splitting method and is free of any nonlinear solver. We show that this new scheme is unconditionally stable as long as the initial condition does not exceed a certain threshold, and that it asymptotically preserves the quasi-static limit in the transient regime. Furthermore, we prove that the fully discrete scheme is conservative and positivity preserving, which makes it ideal for simulations. The analogous schemes for the radially symmetric cases and the subcritical degenerate cases are also presented and analyzed. With extensive numerical tests, we verify the claimed properties of the methods and demonstrate their superiority in various challenging applications.
AP Feb 2, 2018
Analysis and computation of some tumor growth models with nutrient: from cell density models to free boundary dynamics
Jian-Guo Liu, Min Tang, Li Wang et al.
In this paper, we study the tumor growth equation along with various models for the nutrient component, including the in vitro model and the in vivo model. At the cell density level, the spatial availability of the tumor density $n$ is governed by the Darcy law via the pressure $p(n)=n^γ$. For finite $γ$, we prove some a priori estimates of the tumor growth model, such as boundedness of the nutrient density, and non-negativity and growth estimates of the tumor density. As $γ\rightarrow \infty$, the cell density models formally converge to Hele-Shaw flow models, which determine the free boundary dynamics of the tumor tissue in the incompressible limit. We derive several analytical solutions to the Hele-Shaw flow models, which serve as benchmark solutions to the geometric motion of tumor front propagation. Finally, we apply a conservative and positivity preserving numerical scheme to the cell density models, with numerical results verifying the link between cell density models and the free boundary dynamical models.
NA Oct 11, 2016
Explicit and implicit TVD schemes for conservation laws with Caputo derivatives
Jian-Guo Liu, Zheng Ma, Zhennan Zhou
In this paper, we investigate numerical approximations of the scalar conservation law with the Caputo derivative, which introduces the memory effect. We construct first order and second order explicit upwind schemes for such equations, which are shown to be conditionally $\ell^1$ contracting and TVD. However, the Caputo derivative leads to a modified CFL-type stability condition, $(Δt)^α = O(Δx)$, where $α\in (0,1]$ is the fractional exponent in the derivative. When $α$ is small, this strong constraint makes the numerical implementation extremely impractical. We then propose an implicit upwind scheme to overcome this issue, which is proved to be unconditionally $\ell^1$ contracting and TVD. Various numerical tests are presented to validate the properties of the methods and provide more numerical evidence in interpreting the memory effect in conservation laws.
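To make the severity of the constraint concrete, the following is a minimal sketch that solves $(Δt)^α = C\,Δx$ for the largest admissible $Δt$. The constant `c` (set to 1 here) and the function name `max_time_step` are assumptions for illustration only; the abstract gives the scaling but not the constant.

```python
def max_time_step(dx: float, alpha: float, c: float = 1.0) -> float:
    """Largest dt satisfying dt**alpha <= c * dx.

    `c` is a hypothetical stability constant; the abstract only states
    the scaling (dt)^alpha = O(dx).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if dx <= 0.0 or c <= 0.0:
        raise ValueError("dx and c must be positive")
    # Invert dt**alpha = c * dx for dt.
    return (c * dx) ** (1.0 / alpha)


if __name__ == "__main__":
    dx = 0.01
    for alpha in (1.0, 0.5, 0.25):
        print(f"alpha = {alpha}: dt <= {max_time_step(dx, alpha):.1e}")
```

Under these assumptions, with $Δx = 0.01$ the bound shrinks from about $10^{-2}$ at $α = 1$ to about $10^{-8}$ at $α = 0.25$, which illustrates why an unconditionally stable implicit scheme is attractive for small $α$.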
CHEM-PH Apr 2, 2017
Path integral molecular dynamics with surface hopping for thermal equilibrium sampling of nonadiabatic systems
Jianfeng Lu, Zhennan Zhou
In this work, a novel ring polymer representation for multi-level quantum systems is proposed for thermal average calculations. The proposed representation keeps the discreteness of the electronic states: besides position and momentum, each bead in the ring polymer is also characterized by a surface index indicating the electronic energy surface. A path integral molecular dynamics with surface hopping (PIMD-SH) method is also developed to sample the equilibrium distribution of the ring polymer configurational space. The PIMD-SH sampling method is validated theoretically and by numerical examples.
NA Feb 11, 2018
The Gaussian wave packets transform for the semi-classical Schrödinger equation with vector potentials
Zhennan Zhou, Giovanni Russo
In this paper, we reformulate the semi-classical Schrödinger equation in the presence of an electromagnetic field by the Gaussian wave packets transform. With this approach, the highly oscillatory Schrödinger equation is equivalently transformed into another Schrödinger type wave equation, the $w$ equation, which is essentially not oscillatory and thus requires much less computational effort. We propose two numerical methods to solve the $w$ equation, where the Hamiltonian is either divided into the kinetic, the potential and the convection part, or into the kinetic and the potential-convection part. The convection, or the potential-convection part, is treated by a semi-Lagrangian method, while the kinetic part is solved by the Fourier spectral method. The numerical methods are proved to be unconditionally stable, spectrally accurate in space and second order accurate in time, and in principle they can be extended to higher order schemes in time. Various one-dimensional and multidimensional numerical tests are provided to justify the properties of the proposed methods.
NA Mar 22, 2017
Frozen Gaussian approximation with surface hopping for mixed quantum-classical dynamics: A mathematical justification of fewest switches surface hopping algorithms
Jianfeng Lu, Zhennan Zhou
We develop a surface hopping algorithm based on frozen Gaussian approximation for semiclassical matrix Schrödinger equations, in the spirit of Tully's fewest switches surface hopping method. The algorithm is asymptotically derived from the Schrödinger equation with rigorous approximation error analysis. The resulting algorithm can be viewed as a path integral stochastic representation of the semiclassical matrix Schrödinger equations. Our results provide mathematical understanding to and shed new light on the important class of surface hopping methods in theoretical and computational chemistry.
CHEM-PH Jan 29, 2018
Accelerated sampling by infinite swapping of path integral molecular dynamics with surface hopping
Jianfeng Lu, Zhennan Zhou
To accelerate the thermal equilibrium sampling of multi-level quantum systems, the infinite swapping limit of a recently proposed multi-level ring polymer representation is investigated. In the infinite swapping limit, the ring polymer evolves according to an averaged Hamiltonian with respect to all possible surface index configurations of the ring polymer, thus connecting the surface hopping approach to the mean-field path integral molecular dynamics. A multiscale integrator for the infinite swapping limit is also proposed to enable efficient sampling based on the limiting dynamics. Numerical results demonstrate a large improvement in the sampling efficiency of the infinite swapping limit compared with the direct simulation of path integral molecular dynamics with surface hopping.
NA Dec 13, 2017
On Runge-Kutta methods for the water wave equation and its simplified nonlocal hyperbolic model
Lei Li, Jian-Guo Liu, Zibu Liu et al.
There has been growing interest in numerical approximations of the water wave equation in recent years, yet the lack of rigorous analysis of its time discretization inhibits the design of more efficient algorithms. In this work, we focus on a nonlocal hyperbolic model, which is simplified from the water wave equation and essentially inherits its features. For the constant coefficient case, we carry out systematic stability studies of the fully discrete approximation of such systems with the Fourier spectral approximation in space and general Runge-Kutta methods in time. In particular, we discover the optimal time step constraints, in the form of a modified CFL condition, when certain explicit Runge-Kutta methods are applied. In addition, the convergence of the semi-discrete approximation for the variable coefficient case is shown, which naturally connects to the water wave equation. Extensive numerical tests have been performed to verify the stability conditions, and simulations of the simplified hyperbolic model in the high frequency regime and of the water wave equation are also provided.
CL Oct 14, 2024
Jailbreak Instruction-Tuned LLMs via end-of-sentence MLP Re-weighting
Yifan Luo, Zhennan Zhou, Meitan Wang et al.
In this paper, we investigate the safety mechanisms of instruction fine-tuned large language models (LLMs). We discover that re-weighting MLP neurons can significantly compromise a model's safety, especially for MLPs in end-of-sentence inferences. We hypothesize that LLMs evaluate the harmfulness of prompts during end-of-sentence inferences, and that MLP layers play a critical role in this process. Based on this hypothesis, we develop two novel white-box jailbreak methods: a prompt-specific method and a prompt-general method. The prompt-specific method targets individual prompts and optimizes the attack on the fly, while the prompt-general method is pre-trained offline and can generalize to unseen harmful prompts. Our methods demonstrate robust performance across 7 popular open-source LLMs, with sizes ranging from 2B to 72B. Furthermore, our study provides insights into vulnerabilities of instruction-tuned LLMs' safety and deepens the understanding of the internal mechanisms of LLMs.
LG Oct 22, 2023
Prompt Engineering Through the Lens of Optimal Control
Yifan Luo, Yiming Tang, Chengfeng Shen et al.
Prompt Engineering (PE) has emerged as a critical technique for guiding Large Language Models (LLMs) in solving intricate tasks. Its importance is highlighted by its potential to significantly enhance the efficiency and effectiveness of human-machine interaction. As tasks grow increasingly complex, recent advanced PE methods have extended beyond the limitations of single-round interactions to embrace multi-round interactions, which allows for a deeper and more nuanced engagement with LLMs. In this paper, we propose an optimal control framework tailored for multi-round interactions with LLMs. This framework provides a unified mathematical structure that not only systematizes the existing PE methods but also sets the stage for rigorous analytical improvements. Furthermore, we extend this framework to include PE via ensemble methods and multi-agent collaboration, thereby enlarging the scope of applicability. By adopting an optimal control perspective, we offer fresh insights into existing PE methods and highlight theoretical challenges that warrant future research. Besides, our work lays a foundation for the development of more effective and interpretable PE methods.
61.1 CL May 13
STOP: Structured On-Policy Pruning of Long-Form Reasoning in Low-Data Regimes
Chenjun Xu, Zhennan Zhou, Zhan Su et al.
Long chain-of-thought (Long CoT) reasoning improves performance on multi-step problems, but it also induces overthinking: models often generate low-yield reasoning that increases inference cost and latency. This inefficiency is especially problematic in low-data fine-tuning regimes, where real applications adapt reasoning models with limited supervision and cannot rely on large-scale teacher distillation or heavy test-time control. To address this, we propose STOP (Structured On-policy Pruning), an on-policy algorithm for analyzing and pruning long-form reasoning traces. STOP constructs self-distilled traces from the model. Then it maps each trace into a structured reasoning interface through node segmentation, taxonomy annotation, and reasoning-tree construction. On top of this interface, we introduce ECN (Earliest Correct Node), which retains the shortest prefix ending at the earliest node that both functions as an answering conclusion and yields the correct final answer, removing redundant post-solution reasoning while preserving semantic continuity. Experiments on DeepSeek-R1-Distill-Qwen-7B and DeepSeek-R1-Distill-LLaMA-3-8B across GSM8K, Math 500, and AIME 2024 show that STOP reduces generated tokens by 19.4-42.4% while largely preserving accuracy in low-data fine-tuning. Beyond efficiency, our analyses show that STOP induces much smaller distributional shift than teacher-guided pruning, improves the structural efficiency of generated reasoning, and reallocates reasoning effort away from redundant verification and backtracking toward more productive exploration.
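The ECN rule described above can be sketched on a flat sequence of reasoning nodes. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Node` fields, the exact-match correctness check, and the function name `ecn_prefix` are hypothetical, and the flat list simplifies the reasoning-tree interface mentioned in the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    text: str                     # one segmented reasoning step
    is_conclusion: bool           # hypothetical label: does it state an answer?
    answer: Optional[str] = None  # the answer it states, if any


def ecn_prefix(nodes: List[Node], gold: str) -> Optional[List[Node]]:
    """Return the shortest prefix ending at the earliest node that is an
    answering conclusion and states the correct answer, or None if no
    such node exists. Correctness is assumed to be an exact string match."""
    for i, node in enumerate(nodes):
        if node.is_conclusion and node.answer == gold:
            return nodes[: i + 1]
    return None


if __name__ == "__main__":
    trace = [
        Node("Set up the equation.", False),
        Node("So x = 5.", True, "5"),          # conclusion, but wrong
        Node("Wait, recheck the sign.", False),
        Node("Therefore x = 4.", True, "4"),   # earliest correct conclusion
        Node("Verify again: x = 4.", True, "4"),
    ]
    kept = ecn_prefix(trace, gold="4")
    print([n.text for n in kept])
```

In this toy trace the final verification step is dropped while the earlier incorrect conclusion and the backtracking step stay in the prefix, which is one way to read "preserving semantic continuity".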
AI Feb 12
From Atoms to Trees: Building a Structured Feature Forest with Hierarchical Sparse Autoencoders
Yifan Luo, Yang Zhan, Jiedong Jiang et al.
Sparse autoencoders (SAEs) have proven effective for extracting monosemantic features from large language models (LLMs), yet these features are typically identified in isolation. However, broad evidence suggests that LLMs capture the intrinsic structure of natural language, where the phenomenon of "feature splitting" in particular indicates that such structure is hierarchical. To capture this, we propose the Hierarchical Sparse Autoencoder (HSAE), which jointly learns a series of SAEs and the parent-child relationships between their features. HSAE strengthens the alignment between parent and child features through two novel mechanisms: a structural constraint loss and a random feature perturbation mechanism. Extensive experiments across various LLMs and layers demonstrate that HSAE consistently recovers semantically meaningful hierarchies, supported by both qualitative case studies and rigorous quantitative metrics. At the same time, HSAE preserves the reconstruction fidelity and interpretability of standard SAEs across different dictionary sizes. Our work provides a powerful, scalable tool for discovering and analyzing the multi-scale conceptual structures embedded in LLM representations.
69.7 NA Mar 19
Resolving the Blow-Up: A Time-Dilated Numerical Framework for Multiple Firing Events in Mean-Field Neuronal Networks
Xu'an Dou, Louis Tao, Zhe Xue et al.
In large-scale excitatory neuronal networks, rapid synchronization manifests as multiple firing events (MFEs), mathematically characterized by a finite-time blow-up of the neuronal firing rate in the mean-field Fokker-Planck equation. Standard numerical methods struggle to resolve this singularity due to the divergent boundary flux and the instantaneous nature of the population voltage reset. In this work, we propose a robust multiscale numerical framework based on time dilation. By transforming the governing equation into a dilated timescale proportional to the firing activity, we desingularize the blow-up, effectively stretching the instantaneous synchronization event into a resolved mesoscopic process. This approach is shown to be physically consistent with the microscopic cascade mechanism underlying MFEs and the system's inherent fragility. To implement this numerically, we develop a hybrid scheme that utilizes a mesh-independent flux criterion to switch between timescales and a semi-analytical "moving Gaussian" method to accurately evolve the post-blowup Dirac mass. Numerical benchmarks demonstrate that our solver not only captures steady states with high accuracy but also efficiently reproduces periodic MFEs, matching Monte Carlo simulations without the severe time-step restrictions associated with particle cascades.
LG Jun 9, 2025
InverseScope: Scalable Activation Inversion for Interpreting Large Language Models
Yifan Luo, Zhennan Zhou, Bin Dong
Understanding the internal representations of large language models (LLMs) is a central challenge in interpretability research. Existing feature interpretability methods often rely on strong assumptions about the structure of representations that may not hold in practice. In this work, we introduce InverseScope, an assumption-light and scalable framework for interpreting neural activations via input inversion. Given a target activation, we define a distribution over inputs that generate similar activations and analyze this distribution to infer the encoded information. To address the inefficiency of sampling in high-dimensional spaces, we propose a novel conditional generation architecture that significantly improves sample efficiency compared to previous methods. We further introduce a quantitative evaluation protocol that tests interpretability hypotheses using the feature consistency rate computed over the sampled inputs. InverseScope scales inversion-based interpretability methods to larger models and practical tasks, enabling systematic and quantitative analysis of internal representations in real-world LLMs.
NA Aug 28, 2017
An accurate front capturing scheme for tumor growth models with a free boundary limit
Jian-Guo Liu, Min Tang, Li Wang et al.
We consider a class of tumor growth models under the combined effects of density-dependent pressure and cell multiplication, with a free boundary model as its singular limit when the pressure-density relationship becomes highly nonlinear. In particular, the constitutive law connecting pressure $p$ and density $ρ$ is $p(ρ)=\frac{m}{m-1} ρ^{m-1}$, and when $m \gg 1$, the cell density $ρ$ may evolve its support due to a pressure-driven geometric motion with a sharp interface along the boundary of its support. The nonlinearity and degeneracy in the diffusion bring great challenges in numerical simulations, let alone the capturing of the singular free boundary limit. Prior to the present paper, there was no standard mechanism to numerically capture the front propagation speed when $m\gg 1$. In this paper, we develop a numerical scheme based on a novel prediction-correction reformulation that can accurately approximate the front propagation even when the nonlinearity is extremely strong. We show that the semi-discrete scheme naturally connects to the free boundary limit equation as $m \rightarrow \infty$, and that with proper spatial discretization, the fully discrete scheme has improved stability, preserves positivity, and can be implemented without nonlinear solvers. Finally, extensive numerical examples in both one and two dimensions are provided to verify the claimed properties and showcase good performance in various applications.
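The stiffness of the constitutive law for large $m$ can be seen by evaluating it directly. The snippet below is a small sketch that only tabulates $p(ρ)=\frac{m}{m-1} ρ^{m-1}$; the function name `pressure` and the sample values of $ρ$ and $m$ are chosen here for illustration and do not come from the paper.

```python
def pressure(rho: float, m: float) -> float:
    """Constitutive law p(rho) = m / (m - 1) * rho**(m - 1), for m > 1."""
    if m <= 1.0:
        raise ValueError("m must be greater than 1")
    if rho < 0.0:
        raise ValueError("rho must be non-negative")
    return m / (m - 1.0) * rho ** (m - 1.0)


if __name__ == "__main__":
    # Illustrative densities just below, at, and just above rho = 1.
    for m in (2.0, 10.0, 100.0):
        row = ", ".join(
            f"p({rho}) = {pressure(rho, m):.3e}" for rho in (0.9, 1.0, 1.1)
        )
        print(f"m = {m:5.0f}: {row}")
```

As $m$ grows, the pressure becomes negligible for $ρ < 1$ and very large for $ρ > 1$, so the law behaves almost like a switch at $ρ = 1$. This is consistent with the sharp-interface behavior described in the abstract.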
CHEM-PH Sep 8, 2016
Improved sampling and validation of frozen Gaussian approximation with surface hopping algorithm for nonadiabatic dynamics
Jianfeng Lu, Zhennan Zhou
In the spirit of the fewest switches surface hopping, the frozen Gaussian approximation with surface hopping (FGA-SH) method samples a path integral representation of the non-adiabatic dynamics in the semiclassical regime. An improved sampling scheme is developed in this work for FGA-SH based on birth and death branching processes. The algorithm is validated for the standard test examples of non-adiabatic dynamics.