Lizhi Xiong

CR
h-index: 7
Papers: 8
Citations: 62
Novelty: 39%
AI Score: 43

8 Papers

77.5 | CV | Apr 17 | Code
Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration

Jun Li, Lizhi Xiong, Ziqiang Li et al.

Text-to-image generative models have achieved impressive fidelity and diversity, but can inadvertently produce unsafe or undesirable content due to implicit biases embedded in large-scale training datasets. Existing concept erasure methods, whether text-only or image-assisted, face trade-offs: textual approaches often fail to fully suppress concepts, while naive image-guided methods risk over-erasing unrelated content. We propose TICoE, a text-image Collaborative Erasing framework that achieves precise and faithful concept removal through a continuous convex concept manifold and hierarchical visual representation learning. TICoE precisely removes target concepts while preserving unrelated semantic and visual content. To objectively assess the quality of erasure, we further introduce a fidelity-oriented evaluation strategy that measures post-erasure usability. Experiments on multiple benchmarks show that TICoE surpasses prior methods in concept removal precision and content fidelity, enabling safer, more controllable text-to-image generation. Our code is available at https://github.com/OpenAscent-L/TICoE.git

CV | May 18, 2025 | Code
Is Artificial Intelligence Generated Image Detection a Solved Problem?

Ziqiang Li, Jiazhen Yan, Ziwen He et al.

The rapid advancement of generative models, such as GANs and diffusion models, has enabled the creation of highly realistic synthetic images, raising serious concerns about misinformation, deepfakes, and copyright infringement. Although numerous Artificial Intelligence Generated Image (AIGI) detectors have been proposed, often reporting high accuracy, their effectiveness in real-world scenarios remains questionable. To bridge this gap, we introduce AIGIBench, a comprehensive benchmark designed to rigorously evaluate the robustness and generalization capabilities of state-of-the-art AIGI detectors. AIGIBench simulates real-world challenges through four core tasks: multi-source generalization, robustness to image degradation, sensitivity to data augmentation, and impact of test-time pre-processing. It includes 23 diverse fake image subsets that span both advanced and widely adopted image generation techniques, along with real-world samples collected from social media and AI art platforms. Extensive experiments on 11 advanced detectors demonstrate that, despite their high reported accuracy in controlled settings, these detectors suffer significant performance drops on real-world data, limited benefits from common augmentations, and nuanced effects of pre-processing, highlighting the need for more robust detection strategies. By providing a unified and realistic evaluation framework, AIGIBench offers valuable insights to guide future research toward dependable and generalizable AIGI detection. Data and code are publicly available at: https://github.com/HorizonTEL/AIGIBench.

62.0 | CR | Apr 14
Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling

Zida Li, Jun Li, Yuzhe Sha et al.

Text-to-image (T2I) diffusion models have achieved remarkable success in image synthesis, but their reliance on large-scale data and open ecosystems introduces serious backdoor security risks. Existing defenses, particularly input-level methods, are more practical for deployment but often rely on observable anomalies that become unreliable under stealthy, semantics-preserving trigger designs. As modern backdoor attacks increasingly embed triggers into natural inputs, these methods degrade substantially, raising a critical question: can more stable, implicit, and trigger-agnostic differences between benign and backdoor inputs be exploited for detection? In this work, we address this challenge from an active probing perspective. We introduce controlled scaling perturbations on cross-attention and uncover a novel phenomenon termed Cross-Attention Scaling Response Divergence (CSRD), where benign and backdoor inputs exhibit systematically different response evolution patterns across denoising steps. Building on this insight, we propose SET, an input-level backdoor detection framework that constructs response-offset features under multi-scale perturbations and learns a compact benign response space from a small set of clean samples. Detection is then performed by measuring deviations from this learned space, without requiring prior knowledge of the attack or access to model training. Extensive experiments demonstrate that SET consistently outperforms existing baselines across diverse attack methods, trigger types, and model settings, with particularly strong gains under stealthy implicit-trigger scenarios. Overall, SET improves AUROC by 9.1% and ACC by 6.5% over the best baseline, highlighting its effectiveness and robustness for practical deployment.
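The final detection step described above (learn a compact benign region from a few clean samples, then flag inputs that deviate from it) can be illustrated with a generic one-class scorer. This is only a sketch under stated assumptions: the feature vectors are hypothetical stand-ins for the response-offset features, and the per-feature z-score distance is a swapped-in, simple technique, not the learned space used by SET.

```python
import math

def fit_benign(features):
    """Estimate per-feature mean and standard deviation from clean samples.

    `features` is a list of equal-length float lists (hypothetical
    stand-ins for response-offset features of benign prompts).
    """
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(dim)]
    std = [
        max(math.sqrt(sum((f[j] - mean[j]) ** 2 for f in features) / n), 1e-8)
        for j in range(dim)
    ]
    return mean, std

def deviation(x, mean, std):
    """Distance of x from the benign region, measured in z-score units."""
    return math.sqrt(sum(((xj - m) / s) ** 2 for xj, m, s in zip(x, mean, std)))

def is_suspicious(x, mean, std, threshold):
    """Flag an input whose deviation exceeds a chosen threshold (an assumption)."""
    return deviation(x, mean, std) > threshold

# Toy clean samples; real features would come from the diffusion model.
clean = [[0.0, 1.0], [0.2, 1.1], [-0.2, 0.9], [0.1, 1.0]]
mean, std = fit_benign(clean)
```

In practice the threshold would be calibrated on held-out clean samples, for example at a target false-positive rate.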

CV | Mar 17, 2025
A Comprehensive Survey on Visual Concept Mining in Text-to-image Diffusion Models

Ziqiang Li, Jun Li, Lizhi Xiong et al.

Text-to-image diffusion models have made significant advancements in generating high-quality, diverse images from text prompts. However, the inherent limitations of textual signals often prevent these models from fully capturing specific concepts, thereby reducing their controllability. To address this issue, several approaches have incorporated personalization techniques, utilizing reference images to mine visual concept representations that complement textual inputs and enhance the controllability of text-to-image diffusion models. Despite these advances, a comprehensive, systematic exploration of visual concept mining remains limited. In this paper, we categorize existing research into four key areas: Concept Learning, Concept Erasing, Concept Decomposition, and Concept Combination. This classification provides valuable insights into the foundational principles of Visual Concept Mining (VCM) techniques. Additionally, we identify key challenges and propose future research directions to propel this important and interesting field forward.

CR | Sep 28, 2020
STR: Secure Computation on Additive Shares Using the Share-Transform-Reveal Strategy

Zhihua Xia, Qi Gu, Wenhao Zhou et al.

The rapid development of cloud computing has probably benefited each of us. However, the privacy risks posed by untrustworthy cloud servers have drawn the attention of more and more people and legislatures. Over the last two decades, many works have sought to outsource various specific tasks while ensuring the security of private data. The tasks to be outsourced are countless; however, the computations involved are similar. In this paper, we construct a series of novel protocols that support the secure computation of various functions on numbers (e.g., the basic elementary functions) and matrices (e.g., the calculation of eigenvectors and eigenvalues) across an arbitrary number $n\geq 2$ of servers. All protocols require only a constant number of interaction rounds and achieve low computational complexity. Moreover, the proposed $n$-party protocols ensure the security of private data even if $n-1$ servers collude. Convolutional neural network models are used as case studies to verify the protocols. The theoretical analysis and experimental results demonstrate the correctness, efficiency, and security of the proposed protocols.
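For readers unfamiliar with additive shares, the following minimal sketch shows the basic building block only: splitting a number into $n$ additive shares and adding two shared values locally. The modulus, function names, and the use of Python's `secrets` module are illustrative assumptions; the STR protocols themselves are not reproduced here.

```python
import secrets

# Illustrative modulus (an assumption); any agreed-upon ring size works.
MODULUS = 2**61 - 1

def share(secret, n):
    """Split `secret` into n additive shares that sum to it modulo MODULUS."""
    if n < 2:
        raise ValueError("need at least two servers")
    shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    # The last share is chosen so that all shares sum to the secret.
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    """Recover the secret by summing all shares modulo MODULUS."""
    return sum(shares) % MODULUS

def add_shares(a, b):
    """Each server adds its own two shares locally; no interaction needed."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]
```

Because the first $n-1$ shares are uniformly random, any $n-1$ shares taken together are uniformly distributed and reveal nothing about the secret, which is the intuition behind tolerating $n-1$ colluding servers.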

CR | Sep 15, 2020
Privacy-Preserving Image Retrieval Based on Additive Secret Sharing

Zhihua Xia, Qi Gu, Lizhi Xiong et al.

The rapid growth of digital images motivates individuals and organizations to upload their images to cloud servers. To preserve privacy, image owners would prefer to encrypt their images before uploading, but encryption strongly limits how efficiently the images can be used. Many existing schemes for privacy-preserving Content-Based Image Retrieval (PPCBIR) seek a balance between security and retrieval ability. However, compared with advanced CBIR technologies such as Convolutional Neural Networks (CNNs), existing PPCBIR schemes fall far short in both accuracy and efficiency. As the number of cloud service providers grows, collaborative secure image retrieval provided by multiple cloud servers becomes possible. In this paper, inspired by additive secret sharing, we propose a series of more efficient additive secure computing protocols on numbers and matrices, and then show their application to PPCBIR. Specifically, we extract CNN features, reduce their dimensionality, and build the index securely with the help of our protocols, which cover the full process of image retrieval in the plaintext domain. The experiments and security analysis demonstrate the efficiency, accuracy, and security of our scheme.

CR | Sep 11, 2020
Efficient Privacy-Preserving Computation Based on Additive Secret Sharing

Lizhi Xiong, Wenhao Zhou, Zhihua Xia et al.

The emergence of cloud computing provides a new computing paradigm for users: massive and complex computing tasks can be outsourced to cloud servers. However, privacy issues follow. Fully homomorphic encryption shows great potential for privacy-preserving computation, yet it is not ready for practice. At present, secure multiparty computation (MPC) remains the main approach to handling sensitive data. In this paper, following the secret-sharing-based MPC paradigm, we propose a secure 2-party computation scheme in which cloud servers can securely evaluate functions with high efficiency. We first propose multiplicative secret sharing (MSS) based on typical additive secret sharing (ASS). Then, we design protocols to switch a shared secret between MSS and ASS, on which we build a series of protocols for comparison and nearly all of the elementary functions. We prove that all the proposed protocols are Universally Composable secure in the honest-but-curious model. Finally, we show the remarkable progress of our protocols in both communication efficiency and functional completeness.
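The idea of multiplicative sharing can be sketched for two parties as follows: a nonzero secret is written as a product of two shares in a prime field, so multiplying two shared values needs no interaction. The prime, the function names, and the rejection of zero secrets are illustrative assumptions; the paper's MSS construction and its MSS/ASS switching protocols are not reproduced here.

```python
import secrets

# Illustrative prime field (an assumption): 2**61 - 1 is a Mersenne prime.
P = 2**61 - 1

def share_mul(secret):
    """Split a nonzero secret into two shares whose product is the secret mod P."""
    secret %= P
    if secret == 0:
        # A zero secret would force a zero share and leak the value;
        # this sketch simply rejects it.
        raise ValueError("secret must be nonzero modulo P")
    r = secrets.randbelow(P - 1) + 1         # uniform in [1, P-1]
    # pow(r, -1, P) is the modular inverse (Python 3.8+).
    return r, (secret * pow(r, -1, P)) % P

def reconstruct_mul(shares):
    """Recover the secret as the product of both shares mod P."""
    a, b = shares
    return (a * b) % P

def mul_shares(x, y):
    """Each party multiplies its own two shares locally."""
    return (x[0] * y[0]) % P, (x[1] * y[1]) % P
```

Under additive sharing, addition is the local operation and multiplication needs interaction; under multiplicative sharing the roles are reversed, which is one reason switching between the two representations is useful.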

DS | Feb 20, 2018
The Cut and Dominating Set Problem in A Steganographer Network

Hanzhou Wu, Wei Wang, Jing Dong et al.

A steganographer network is a graph structure in which the vertices (also called nodes) denote social entities such as data encoders and data decoders, and the edges represent real communication channels or other social links that could be used for steganography. Unlike traditional steganographic algorithms, a steganographer network models steganographic communication abstractly, so that the underlying characteristics of steganography are quantified as analyzable parameters of the network. In this paper, we analyze two problems in a steganographer network. The first is a passive attack on a steganographer network in which a network monitor has collected a list of suspicious vertices corresponding to data encoders or decoders. The network monitor aims to break (disconnect) the steganographic communication between the suspicious vertices while keeping the cost as low as possible. The second concerns determining a set of vertices corresponding to the data encoders (senders) such that all vertices can share a message via their neighbors. We point out that the two problems are equivalent to the minimum cut problem and the minimum-weight dominating set problem, respectively.
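To make the second problem concrete, the sketch below applies a standard greedy heuristic for the weighted dominating set problem: repeatedly pick the vertex with the lowest weight per newly covered vertex. It is an approximation for illustration, not an exact solver and not the paper's analysis; the graph, weights, and function names are hypothetical.

```python
def is_dominating_set(adj, chosen):
    """Check that every vertex is in `chosen` or adjacent to a chosen vertex."""
    return all(v in chosen or any(u in chosen for u in adj[v]) for v in adj)

def greedy_dominating_set(adj, weight):
    """Greedy heuristic for a low-weight dominating set.

    `adj` maps each vertex to a list of its neighbors (undirected graph);
    `weight` maps each vertex to a positive cost.
    """
    undominated = set(adj)
    chosen = set()
    while undominated:
        def gain(v):
            # Number of still-undominated vertices that v would cover.
            return len(({v} | set(adj[v])) & undominated)
        candidates = [v for v in adj if v not in chosen and gain(v) > 0]
        # Lowest cost per newly covered vertex; ties broken by name.
        best = min(candidates, key=lambda v: (weight[v] / gain(v), str(v)))
        chosen.add(best)
        undominated -= {best} | set(adj[best])
    return chosen

# Toy star-shaped network: one hub connected to three leaves.
star = {"c": ["a", "b", "d"], "a": ["c"], "b": ["c"], "d": ["c"]}
unit = {v: 1.0 for v in star}
senders = greedy_dominating_set(star, unit)
```

On the star graph the hub alone reaches every vertex, so it is the only sender selected when all weights are equal.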