Kui Ren

LG
h-index: 20
14 papers
222 citations
Novelty: 45%
AI Score: 38

14 Papers

15.3 | CV | Sep 18, 2023 | Code
DFIL: Deepfake Incremental Learning by Exploiting Domain-invariant Forgery Clues

Kun Pan, Yin Yifang, Yao Wei et al.

The malicious use and widespread dissemination of deepfakes pose a significant crisis of trust. Current deepfake detection models can generally recognize forged images by training on a large dataset. However, the accuracy of detection models degrades significantly on images generated by new deepfake methods due to the difference in data distribution. To tackle this issue, we present a novel incremental learning framework that improves the generalization of deepfake detection models by continual learning from a small number of new samples. To cope with different data distributions, we propose to learn a domain-invariant representation based on supervised contrastive learning, preventing overfitting to the insufficient new data. To mitigate catastrophic forgetting, we regularize our model at both the feature level and the label level based on a multi-perspective knowledge distillation approach. Finally, we propose to select both central and hard representative samples to update the replay set, which is beneficial for both domain-invariant representation learning and rehearsal-based knowledge preserving. We conduct extensive experiments on four benchmark datasets, obtaining the new state-of-the-art average forgetting rate of 7.01 and average accuracy of 85.49 on FF++, DFDC-P, DFD, and CDF2. Our code is released at https://github.com/DeepFakeIL/DFIL.
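
The replay-set update mentioned in the abstract can be illustrated with a minimal sketch. It assumes each sample of one class is already embedded as a feature vector, that "central" means closest to the class centroid, and that distance from the centroid is an acceptable stand-in for "hard". The function name `select_replay_samples` and both criteria are illustrative assumptions, not the selection rule from the released DFIL code.

```python
import math


def select_replay_samples(features, n_central, n_hard):
    """Return (central, hard) index lists for one class.

    Assumption: central = nearest to the centroid, hard = farthest from it.
    """
    dim = len(features[0])
    centroid = [sum(v[d] for v in features) / len(features) for d in range(dim)]
    dist = [math.dist(v, centroid) for v in features]
    # Indices ordered from nearest to farthest from the centroid.
    order = sorted(range(len(features)), key=lambda i: dist[i])
    central = order[:n_central]
    # Walk from the far end, skipping anything already picked as central.
    hard = [i for i in reversed(order) if i not in central][:n_hard]
    return central, hard


feats = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (9.0, 9.0)]
central, hard = select_replay_samples(feats, n_central=1, n_hard=1)
print(central, hard)
```

Keeping both groups in the replay set is one way to retain typical examples of an old task together with examples that are difficult to represent.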

1.2 | AP | Nov 2, 2015
On the modeling and simulation of reaction-transfer dynamics in semiconductor-electrolyte solar cells

Yuan He, Irene M. Gamba, Heung-Chan Lee et al.

The mathematical modeling and numerical simulation of semiconductor-electrolyte systems play important roles in the design of high-performance semiconductor-liquid junction solar cells. In this work, we propose a macroscopic mathematical model, a system of nonlinear partial differential equations, for the complete description of charge transfer dynamics in such systems. The model consists of a reaction-drift-diffusion-Poisson system that models the transport of electrons and holes in the semiconductor region and an equivalent system that describes the transport of reductants and oxidants, as well as other charged species, in the electrolyte region. The coupling between the semiconductor and the electrolyte is modeled through a set of interfacial reaction and current balance conditions. We present some numerical simulations to illustrate the quantitative behavior of the semiconductor-electrolyte system in both dark and illuminated environments. We show numerically that one can replace the electrolyte region in the system with a Schottky contact only when the bulk reductant-oxidant pair density is extremely high. Otherwise, such a replacement gives a significantly inaccurate description of the real dynamics of the semiconductor-electrolyte system.

8.9 | OC | Feb 8, 2023
Adaptive State-Dependent Diffusion for Derivative-Free Optimization

Björn Engquist, Kui Ren, Yunan Yang

This paper develops and analyzes a stochastic derivative-free optimization strategy. A key feature is the state-dependent adaptive variance. We prove global convergence in probability with algebraic rate and give the quantitative results in numerical examples. A striking fact is that convergence is achieved without explicit information of the gradient and even without comparing different objective function values as in established methods such as the simplex method and simulated annealing. It can otherwise be compared to annealing with state-dependent temperature.
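
A minimal sketch of the idea of a state-dependent variance follows. It assumes a one-dimensional objective whose global minimum value is known to be 0, a bounded search interval, and the rule `sigma = scale * min(1, f(x))`; all of these, along with the names `objective` and `adaptive_diffusion_search`, are illustrative assumptions and not the scheme analyzed in the paper.

```python
import math
import random


def objective(x):
    # Hypothetical nonconvex test function; global minimum f(0) = 0.
    return x * x + 2.0 * math.sin(3.0 * x) ** 2


def adaptive_diffusion_search(f, x0, steps, scale=0.5, lo=-5.0, hi=5.0, seed=0):
    """Random walk whose step size depends only on the current value f(x).

    No gradient is evaluated and no two objective values are compared.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        sigma = scale * min(1.0, f(x))      # large noise where f is large
        x = x + sigma * rng.gauss(0.0, 1.0)
        x = min(hi, max(lo, x))             # keep the iterate in [lo, hi]
    return x


x_final = adaptive_diffusion_search(objective, x0=4.0, steps=5000)
print(x_final, objective(x_final))
```

Because the noise vanishes only where `f` is zero, the walk keeps moving near local minima with positive value and slows down as it approaches a global minimizer.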

1.2 | AP | Apr 28, 2017
Nonlinear quantitative photoacoustic tomography with two-photon absorption

Kui Ren, Rongting Zhang

Two-photon photoacoustic tomography (TP-PAT) is a non-invasive optical molecular imaging modality that aims at inferring the two-photon absorption property of heterogeneous media from photoacoustic measurements. In this work, we analyze an inverse problem in quantitative TP-PAT where we intend to reconstruct optical coefficients in a semilinear elliptic PDE, the mathematical model for the propagation of near-infrared photons in tissue-like optical media with two-photon absorption, from the internal absorbed energy data. We derive uniqueness and stability results on the reconstructions of single and multiple optical coefficients, and present some numerical reconstruction results based on synthetic data to complement the theoretical analysis.

7.0 | OC | Apr 12, 2022
An Algebraically Converging Stochastic Gradient Descent Algorithm for Global Optimization

Björn Engquist, Kui Ren, Yunan Yang

We propose a new gradient descent algorithm with added stochastic terms for finding the global optimizers of nonconvex optimization problems. A key component in the algorithm is the adaptive tuning of the randomness based on the value of the objective function. In the language of simulated annealing, the temperature is state-dependent. With this, we prove the global convergence of the algorithm with an algebraic rate both in probability and in the parameter space. This is a significant improvement over the classical rate from using a more straightforward control of the noise term. The convergence proof is based on the actual discrete setup of the algorithm, not just its continuous limit as often done in the literature. We also present several numerical examples to demonstrate the efficiency and robustness of the algorithm for reasonably complex objective functions.
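
The combination of a gradient step with objective-dependent randomness can be sketched as follows. The noise rule `sigma = noise_scale * sqrt(f(x))`, the assumption that the global minimum value is 0, and the test function are illustrative choices only; the paper's actual noise schedule and convergence conditions are not reproduced here.

```python
import math
import random


def noisy_gradient_descent(f, grad, x0, steps, lr=0.05, noise_scale=0.3, seed=0):
    """Gradient descent plus Gaussian noise whose size depends on f(x).

    Assumes f >= 0 with global minimum value 0, so the noise switches off
    only where f vanishes.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        sigma = noise_scale * math.sqrt(max(f(x), 0.0))
        x = x - lr * grad(x) + sigma * rng.gauss(0.0, 1.0)
    return x


def f(x):
    # Hypothetical nonconvex objective with global minimum f(0) = 0.
    return x * x + 2.0 * math.sin(3.0 * x) ** 2


def grad_f(x):
    # Derivative of f: 2x + 12 sin(3x) cos(3x).
    return 2.0 * x + 12.0 * math.sin(3.0 * x) * math.cos(3.0 * x)


x_final = noisy_gradient_descent(f, grad_f, x0=4.0, steps=2000)
print(x_final, f(x_final))
```

Setting `noise_scale=0` recovers plain gradient descent, which makes it easy to compare the two behaviours on the same objective.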

1.2 | NA | Dec 7, 2018
Characterizing impacts of model uncertainties in quantitative photoacoustics

Kui Ren, Sarah Vallélian

This work is concerned with uncertainty quantification problems for image reconstructions in quantitative photoacoustic imaging (PAT), a recent hybrid imaging modality that utilizes the photoacoustic effect to achieve high-resolution imaging of optical properties of tissue-like heterogeneous media. We quantify mathematically and computationally the impact of uncertainties in various model parameters of PAT on the accuracy of reconstructed optical properties. We derive, via sensitivity analysis, analytical bounds on error in image reconstructions in some simplified settings, and develop a computational procedure, based on the method of polynomial chaos expansion, for such error characterization in more general settings. Numerical simulations based on synthetic data are presented to illustrate the main ideas.

17.6 | LG | Feb 12, 2024
Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning

Z Liu, J Lou, W Bao et al.

Fine-tuning on task-specific datasets is a widely embraced paradigm of harnessing the powerful capability of pretrained LLMs for various downstream tasks. Due to the popularity of LLM fine-tuning and its accompanying privacy concerns, differentially private (DP) fine-tuning of pretrained LLMs has been widely used to safeguard the privacy of task-specific datasets. Lying at the design core of DP LLM fine-tuning methods is the satisfactory tradeoff among privacy, utility, and scalability. Most existing methods build upon the seminal work of DP-SGD. Despite pushing the scalability of DP-SGD to its limit, DP-SGD-based fine-tuning methods are unfortunately limited by the inherent inefficiency of SGD. In this paper, we investigate the potential of DP zeroth-order methods for LLM pretraining, which avoids the scalability bottleneck of SGD by approximating the gradient with the more efficient zeroth-order gradient. Rather than treating the zeroth-order method as a drop-in replacement for SGD, this paper presents a comprehensive study both theoretically and empirically. First, we propose the stagewise DP zeroth-order method (DP-ZOSO) that dynamically schedules key hyperparameters. This design is grounded on the synergy between DP random perturbation and the gradient approximation error of the zeroth-order method, and its effect on fine-tuning trajectory. We provide theoretical analysis for both proposed methods. We conduct extensive empirical analysis on both encoder-only masked language model and decoder-only autoregressive language model, achieving impressive results in terms of scalability and utility regardless of the class of tasks (compared with DPZero, DP-ZOPO improves by $4.5\%$ on SST-5 and $5.5\%$ on MNLI with RoBERTa-Large, and by $9.2\%$ on CB and $3.9\%$ on BoolQ with OPT-2.7b, when $ε=4$, with more significant gains on more complicated tasks).
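
To make the phrase "approximating the gradient with the zeroth-order gradient" concrete, here is a minimal sketch of one differentially private zeroth-order step on a toy linear model. It uses a two-point finite difference along a random direction, per-example clipping of the resulting scalar, and Gaussian noise on the clipped sum. The function names, the toy loss, and the hyperparameter values are assumptions for illustration; this is not DP-ZOSO or DP-ZOPO, and no privacy accounting is performed.

```python
import random


def squared_error(theta, example):
    # Hypothetical per-example loss for a linear model; example = (features, target).
    features, target = example
    prediction = sum(t * x for t, x in zip(theta, features))
    return (prediction - target) ** 2


def dp_zo_step(theta, batch, loss, lr, mu, clip, noise_mult, rng):
    """One noisy zeroth-order update that uses only loss evaluations."""
    z = [rng.gauss(0.0, 1.0) for _ in theta]           # shared random direction
    plus = [t + mu * zi for t, zi in zip(theta, z)]
    minus = [t - mu * zi for t, zi in zip(theta, z)]
    total = 0.0
    for example in batch:
        # Two-point estimate of the directional derivative along z.
        g = (loss(plus, example) - loss(minus, example)) / (2.0 * mu)
        total += max(-clip, min(clip, g))               # per-example clipping
    noise = noise_mult * clip * rng.gauss(0.0, 1.0)     # Gaussian noise on the scalar sum
    g_private = (total + noise) / len(batch)
    return [t - lr * g_private * zi for t, zi in zip(theta, z)]


rng = random.Random(0)
batch = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
theta = [0.0, 0.0]
for _ in range(300):
    theta = dp_zo_step(theta, batch, squared_error, lr=0.02, mu=1e-3,
                       clip=5.0, noise_mult=0.5, rng=rng)
print(theta)
```

Only a single scalar per example is clipped and perturbed, so no per-example gradient vector is ever formed; this is the kind of memory saving that motivates zeroth-order approaches.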

13.5 | CV | Dec 16, 2024 | Code
FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning

Gaojian Wang, Feng Lin, Tong Wu et al.

This work asks: with abundant, unlabeled real faces, how to learn a robust and transferable facial representation that boosts various face security tasks with respect to generalization performance? We make the first attempt and propose a self-supervised pretraining framework to learn fundamental representations of real face images, FSFM, that leverages the synergy between masked image modeling (MIM) and instance discrimination (ID). We explore various facial masking strategies for MIM and present a simple yet powerful CRFR-P masking, which explicitly forces the model to capture meaningful intra-region consistency and challenging inter-region coherency. Furthermore, we devise the ID network that naturally couples with MIM to establish underlying local-to-global correspondence via tailored self-distillation. These three learning objectives, namely 3C, empower encoding both local features and global semantics of real faces. After pretraining, a vanilla ViT serves as a universal vision foundation model for downstream face security tasks: cross-dataset deepfake detection, cross-domain face anti-spoofing, and unseen diffusion facial forgery detection. Extensive experiments on 10 public datasets demonstrate that our model transfers better than supervised pretraining and visual and facial self-supervised learning methods, and even outperforms task-specialized SOTA methods.

2.6 | LG | Nov 20, 2024
Sampling with Adaptive Variance for Multimodal Distributions

Björn Engquist, Kui Ren, Yunan Yang

We propose and analyze a class of adaptive sampling algorithms for multimodal distributions on a bounded domain, which share a structural resemblance to the classic overdamped Langevin dynamics. We first demonstrate that this class of linear dynamics with adaptive diffusion coefficients and vector fields can be interpreted and analyzed as weighted Wasserstein gradient flows of the Kullback--Leibler (KL) divergence between the current distribution and the target Gibbs distribution, which directly leads to the exponential convergence of both the KL and $χ^2$ divergences, with rates depending on the weighted Wasserstein metric and the Gibbs potential. We then show that a derivative-free version of the dynamics can be used for sampling without gradient information of the Gibbs potential and that for Gibbs distributions with nonconvex potentials, this approach could achieve significantly faster convergence than the classical overdamped Langevin dynamics. A comparison of the mean transition times between local minima of a nonconvex potential further highlights the better efficiency of the derivative-free dynamics in sampling.

4.1 | LG | Sep 4, 2025
Instance-Wise Adaptive Sampling for Dataset Construction in Approximating Inverse Problem Solutions

Jiequn Han, Kui Ren, Nathan Soedjak

We propose an instance-wise adaptive sampling framework for constructing compact and informative training datasets for supervised learning of inverse problem solutions. Typical learning-based approaches aim to learn a general-purpose inverse map from datasets drawn from a prior distribution, with the training process independent of the specific test instance. When the prior has a high intrinsic dimension or when high accuracy of the learned solution is required, a large number of training samples may be needed, resulting in substantial data collection costs. In contrast, our method dynamically allocates sampling effort based on the specific test instance, enabling significant gains in sample efficiency. By iteratively refining the training dataset conditioned on the latest prediction, the proposed strategy tailors the dataset to the geometry of the inverse map around each test instance. We demonstrate the effectiveness of our approach in the inverse scattering problem under two types of structured priors. Our results show that the advantage of the adaptive method becomes more pronounced in settings with more complex priors or higher accuracy requirements. While our experiments focus on a particular inverse problem, the adaptive sampling strategy is broadly applicable and readily extends to other inverse problems, offering a scalable and practical alternative to conventional fixed-dataset training regimes.

7.1 | LG | Feb 21, 2025 | Code
CoKV: Optimizing KV Cache Allocation via Cooperative Game

Qiheng Sun, Hongwei Zhang, Haocheng Xia et al.

Large language models (LLMs) have achieved remarkable success in various aspects of human life. However, one of the major challenges in deploying these models is the substantial memory consumption required to store key-value pairs (KV), which imposes significant resource demands. Recent research has focused on KV cache budget allocation, with several approaches proposing head-level budget distribution by evaluating the importance of individual attention heads. These methods, however, assess the importance of heads independently, overlooking their cooperative contributions within the model, which may result in a deviation from their true impact on model performance. In light of this limitation, we propose CoKV, a novel method that models the cooperation between heads in model inference as a cooperative game. By evaluating the contribution of each head within the cooperative game, CoKV can allocate the cache budget more effectively. Extensive experiments show that CoKV achieves state-of-the-art performance on the LongBench benchmark using Llama-3-8B-Instruct and Mistral-7B models.
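
The difference between scoring heads independently and scoring them within a cooperative game can be seen in a small example. The sketch below computes exact Shapley values by enumerating all orderings, which is feasible only for a handful of players. The value function `toy_value`, in which heads 0 and 1 are useful only together, and the proportional rule in `allocate_budget` are hypothetical; the abstract does not specify how CoKV estimates contributions or converts them to budgets.

```python
from itertools import permutations


def toy_value(heads):
    # Hypothetical model score when only the given heads keep their cache.
    score = 0.0
    if 0 in heads and 1 in heads:
        score += 1.0        # heads 0 and 1 help only as a pair
    if 2 in heads:
        score += 0.5        # head 2 helps on its own
    return score


def shapley_values(n_players, value):
    """Exact Shapley values: average marginal gain over all orderings."""
    totals = [0.0] * n_players
    orders = list(permutations(range(n_players)))
    for order in orders:
        coalition = set()
        previous = value(frozenset())
        for player in order:
            coalition.add(player)
            current = value(frozenset(coalition))
            totals[player] += current - previous
            previous = current
    return [t / len(orders) for t in totals]


def allocate_budget(scores, total_budget):
    # Proportional split; rounding means the parts may not sum exactly in general.
    norm = sum(scores)
    return [round(total_budget * s / norm) for s in scores]


solo = [toy_value(frozenset({i})) for i in range(3)]
coop = shapley_values(3, toy_value)
print(solo, coop, allocate_budget(coop, 300))
```

Evaluated alone, heads 0 and 1 look worthless, while the cooperative scores credit each of them as much as head 2.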

14.9 | CL | Jun 6, 2024
A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions

Lei Liu, Xiaoyan Yang, Junchi Lei et al.

With the advent of Large Language Models (LLMs), medical artificial intelligence (AI) has experienced substantial technological progress and paradigm shifts, highlighting the potential of LLMs to streamline healthcare delivery and improve patient outcomes. Considering this rapid technical progress, in this survey, we trace the recent advances of Medical Large Language Models (Med-LLMs), including the background, key findings, and mainstream techniques, especially for the evolution from general-purpose models to medical-specialized applications. Firstly, we delve into the foundational technology of Med-LLMs, indicating how general models can be progressively adapted and refined for the complicated medical tasks. Secondly, the wide-ranging applications of Med-LLMs are investigated across various healthcare domains, as well as an up-to-date review of existing Med-LLMs. The transformative impact of these models on daily medical practice is evident through their ability to assist clinicians, educators, and patients. Recognizing the importance of responsible innovation, we discuss the challenges associated with ensuring fairness, accountability, privacy, and robustness. Ethical considerations, rigorous evaluation methodologies, and the establishment of regulatory frameworks are crucial for building trustworthiness in the real-world system. We emphasize the need for ongoing scrutiny and development to maintain high standards of safety and reliability. Finally, we anticipate possible future trajectories for Med-LLMs, identifying key avenues for prudent expansion. By consolidating these insights, our review aims to provide professionals and researchers with a thorough understanding of the strengths and limitations of Med-LLMs, fostering a balanced and ethical approach to their integration into the healthcare ecosystem.

1.8 | LG | Jan 23, 2022
A Generalized Weighted Optimization Method for Computational Learning and Inversion

Björn Engquist, Kui Ren, Yunan Yang

The generalization capacity of various machine learning models exhibits different phenomena in the under- and over-parameterized regimes. In this paper, we focus on regression models such as feature regression and kernel regression and analyze a generalized weighted least-squares optimization method for computational learning and inversion with noisy data. The highlight of the proposed framework is that we allow weighting in both the parameter space and the data space. The weighting scheme encodes both a priori knowledge on the object to be learned and a strategy to weight the contribution of different data points in the loss function. Here, we characterize the impact of the weighting scheme on the generalization error of the learning method, where we derive explicit generalization errors for the random Fourier feature model in both the under- and over-parameterized regimes. For more general feature maps, error bounds are provided based on the singular values of the feature matrix. We demonstrate that appropriate weighting from prior knowledge can improve the generalization capability of the learned model.
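
Weighting in both the data space and the parameter space can be illustrated on a two-parameter line fit. The sketch minimizes sum_i w_i (a + b x_i - y_i)^2 + p_a a^2 + p_b b^2 with diagonal weights and solves the 2x2 normal equations directly. This generic weighted least-squares form and the name `weighted_least_squares_line` are assumptions for illustration, not the specific weighting scheme analyzed in the paper.

```python
def weighted_least_squares_line(xs, ys, data_weights, param_penalty=(0.0, 0.0)):
    """Fit y ~ a + b*x with per-sample weights and per-parameter penalties."""
    p_a, p_b = param_penalty
    s_w = sum(data_weights)
    s_wx = sum(w * x for w, x in zip(data_weights, xs))
    s_wxx = sum(w * x * x for w, x in zip(data_weights, xs))
    s_wy = sum(w * y for w, y in zip(data_weights, ys))
    s_wxy = sum(w * x * y for w, x, y in zip(data_weights, xs, ys))
    # Normal equations: [[m11, m12], [m12, m22]] @ [a, b] = [s_wy, s_wxy].
    m11, m12, m22 = s_w + p_a, s_wx, s_wxx + p_b
    det = m11 * m22 - m12 * m12
    a = (s_wy * m22 - m12 * s_wxy) / det
    b = (m11 * s_wxy - m12 * s_wy) / det
    return a, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 100.0]          # last point is a gross outlier
print(weighted_least_squares_line(xs, ys, [1.0, 1.0, 1.0, 1.0]))
print(weighted_least_squares_line(xs, ys, [1.0, 1.0, 1.0, 0.0]))
```

A zero data weight removes the outlier from the fit, while a positive entry in `param_penalty` encodes a prior preference for a small value of the corresponding parameter.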

5.1 | NA | Nov 15, 2019
The quadratic Wasserstein metric for inverse data matching

Björn Engquist, Kui Ren, Yunan Yang

This work characterizes, analytically and numerically, two major effects of the quadratic Wasserstein ($W_2$) distance as the measure of data discrepancy in computational solutions of inverse problems. First, we show, in the infinite-dimensional setup, that the $W_2$ distance has a smoothing effect on the inversion process, making it robust against high-frequency noise in the data but leading to a reduced resolution for the reconstructed objects at a given noise level. Second, we demonstrate that for some finite-dimensional problems, the $W_2$ distance leads to optimization problems that have better convexity than the classical $L^2$ and $H^{-1}$ distances, making it a preferable distance to use when solving such inverse matching problems.
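
For intuition about the $W_2$ distance as a data-discrepancy measure, the one-dimensional case has a simple closed form for two empirical samples of equal size: sort both samples and average the squared differences of matched order statistics. The sketch below covers only that special case; the helper name `w2_squared_1d` is hypothetical, and the signals treated in the paper would first need to be normalized into probability distributions.

```python
def w2_squared_1d(xs, ys):
    """Squared W_2 distance between two equal-size 1D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    # In 1D the optimal transport plan matches sorted samples in order.
    pairs = zip(sorted(xs), sorted(ys))
    return sum((a - b) ** 2 for a, b in pairs) / len(xs)


base = [0.0, 1.0, 3.0]
for shift in (-2.0, -1.0, 0.0, 1.0, 2.0):
    shifted = [x + shift for x in base]
    print(shift, w2_squared_1d(base, shifted))
```

For a pure translation the value equals the squared shift, which is convex in the shift; this is a simple instance of the kind of convexity advantage the abstract refers to.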