CVJul 18, 2024Code
A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion AttacksYixiang Qiu, Hao Fang, Hongyao Yu et al.
Model Inversion (MI) attacks aim to reconstruct privacy-sensitive training data from released models by utilizing output information, raising extensive concerns about the security of Deep Neural Networks (DNNs). Recent advances in generative adversarial networks (GANs) have contributed significantly to the improved performance of MI attacks due to their powerful ability to generate realistic images with high fidelity and appropriate semantics. However, previous MI attacks have solely disclosed private information in the latent space of GAN priors, limiting their semantic extraction and transferability across multiple target models and datasets. To address this challenge, we propose a novel method, Intermediate Features enhanced Generative Model Inversion (IF-GMI), which disassembles the GAN structure and exploits features between intermediate blocks. This allows us to extend the optimization space from latent code to intermediate features with enhanced expressive capabilities. To prevent GAN priors from generating unrealistic images, we apply a L1 ball constraint to the optimization process. Experiments on multiple benchmarks demonstrate that our method significantly outperforms previous approaches and achieves state-of-the-art results under various settings, especially in the out-of-distribution (OOD) scenario. Our code is available at: https://github.com/final-solution/IF-GMI
CLApr 13Code
C-ReD: A Comprehensive Chinese Benchmark for AI-Generated Text Detection Derived from Real-World PromptsChenxi Qing, Junxi Wu, Zheng Liu et al.
Recently, large language models (LLMs) are capable of generating highly fluent textual content. While they offer significant convenience to humans, they also introduce various risks, like phishing and academic dishonesty. Numerous research efforts have been dedicated to developing algorithms for detecting AI-generated text and constructing relevant datasets. However, in the domain of Chinese corpora, challenges remain, including limited model diversity and data homogeneity. To address these issues, we propose C-ReD: a comprehensive Chinese Real-prompt AI-generated Detection benchmark. Experiments demonstrate that C-ReD not only enables reliable in-domain detection but also supports strong generalization to unseen LLMs and external Chinese datasets-addressing critical gaps in model diversity, domain coverage, and prompt realism that have limited prior Chinese detection benchmarks. We release our resources at https://github.com/HeraldofLight/C-ReD.
CVFeb 6, 2024Code
Privacy Leakage on DNNs: A Survey of Model Inversion Attacks and DefensesHao Fang, Yixiang Qiu, Hongyao Yu et al.
Deep Neural Networks (DNNs) have revolutionized various domains with their exceptional performance across numerous applications. However, Model Inversion (MI) attacks, which disclose private information about the training dataset by abusing access to the trained models, have emerged as a formidable privacy threat. Given a trained network, these attacks enable adversaries to reconstruct high-fidelity data that closely aligns with the private training samples, posing significant privacy concerns. Despite the rapid advances in the field, we lack a comprehensive and systematic overview of existing MI attacks and defenses. To fill this gap, this paper thoroughly investigates this realm and presents a holistic survey. Firstly, our work briefly reviews early MI studies on traditional machine learning scenarios. We then elaborately analyze and compare numerous recent attacks and defenses on Deep Neural Networks (DNNs) across multiple modalities and learning tasks. By meticulously analyzing their distinctive features, we summarize and classify these methods into different categories and provide a novel taxonomy. Finally, this paper discusses promising research directions and presents potential solutions to open issues. To facilitate further study on MI attacks and defenses, we have implemented an open-source model inversion toolbox on GitHub (https://github.com/ffhibnese/Model-Inversion-Attack-ToolBox).
CVJul 7, 2025Code
ICAS: Detecting Training Data from Autoregressive Image Generative ModelsHongyao Yu, Yixiang Qiu, Yiheng Yang et al.
Autoregressive image generation has witnessed rapid advancements, with prominent models such as scale-wise visual auto-regression pushing the boundaries of visual synthesis. However, these developments also raise significant concerns regarding data privacy and copyright. In response, training data detection has emerged as a critical task for identifying unauthorized data usage in model training. To better understand the vulnerability of autoregressive image generative models to such detection, we conduct the first study applying membership inference to this domain. Our approach comprises two key components: implicit classification and an adaptive score aggregation strategy. First, we compute the implicit token-wise classification score within the query image. Then we propose an adaptive score aggregation strategy to acquire a final score, which places greater emphasis on the tokens with lower scores. A higher final score indicates that the sample is more likely to be involved in the training set. To validate the effectiveness of our method, we adapt existing detection algorithms originally designed for LLMs to visual autoregressive models. Extensive experiments demonstrate the superiority of our method in both class-conditional and text-to-image scenarios. Moreover, our approach exhibits strong robustness and generalization under various data transformations. Furthermore, sufficient experiments suggest two novel key findings: (1) A linear scaling law on membership inference, exposing the vulnerability of large foundation models. (2) Training data from scale-wise visual autoregressive models is easier to detect than other autoregressive paradigms.Our code is available at https://github.com/Chrisqcwx/ImageAR-MIA.
CLMay 21, 2025
Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text DetectorsHao Fang, Jiawei Kong, Tianqu Zhuang et al.
The misuse of large language models (LLMs), such as academic plagiarism, has driven the development of detectors to identify LLM-generated texts. To bypass these detectors, paraphrase attacks have emerged to purposely rewrite these texts to evade detection. Despite the success, existing methods require substantial data and computational budgets to train a specialized paraphraser, and their attack efficacy greatly reduces when faced with advanced detection algorithms. To address this, we propose \textbf{Co}ntrastive \textbf{P}araphrase \textbf{A}ttack (CoPA), a training-free method that effectively deceives text detectors using off-the-shelf LLMs. The first step is to carefully craft instructions that encourage LLMs to produce more human-like texts. Nonetheless, we observe that the inherent statistical biases of LLMs can still result in some generated texts carrying certain machine-like attributes that can be captured by detectors. To overcome this, CoPA constructs an auxiliary machine-like word distribution as a contrast to the human-like distribution generated by the LLM. By subtracting the machine-like patterns from the human-like distribution during the decoding process, CoPA is able to produce sentences that are less discernible by text detectors. Our theoretical analysis suggests the superiority of the proposed attack. Extensive experiments validate the effectiveness of CoPA in fooling text detectors across various scenarios.
CVAug 31, 2025
Secure and Scalable Face Retrieval via Cancelable Product QuantizationHaomiao Tang, Wenjie Li, Yixiang Qiu et al.
Despite the ubiquity of modern face retrieval systems, their retrieval stage is often outsourced to third-party entities, posing significant risks to user portrait privacy. Although homomorphic encryption (HE) offers strong security guarantees by enabling arithmetic computations in the cipher space, its high computational inefficiency makes it unsuitable for real-time, real-world applications. To address this issue, we propose Cancelable Product Quantization, a highly efficient framework for secure face representation retrieval. Our hierarchical two-stage framework comprises: (i) a high-throughput cancelable PQ indexing module for fast candidate filtering, and (ii) a fine-grained cipher-space retrieval module for final precise face ranking. A tailored protection mechanism is designed to secure the indexing module for cancelable biometric authentication while ensuring efficiency. Experiments on benchmark datasets demonstrate that our method achieves an decent balance between effectiveness, efficiency and security.
CVAug 28, 2025
Revisiting the Privacy Risks of Split Inference: A GAN-Based Data Reconstruction Attack via Progressive Feature OptimizationYixiang Qiu, Yanhan Liu, Hongyao Yu et al.
The growing complexity of Deep Neural Networks (DNNs) has led to the adoption of Split Inference (SI), a collaborative paradigm that partitions computation between edge devices and the cloud to reduce latency and protect user privacy. However, recent advances in Data Reconstruction Attacks (DRAs) reveal that intermediate features exchanged in SI can be exploited to recover sensitive input data, posing significant privacy risks. Existing DRAs are typically effective only on shallow models and fail to fully leverage semantic priors, limiting their reconstruction quality and generalizability across datasets and model architectures. In this paper, we propose a novel GAN-based DRA framework with Progressive Feature Optimization (PFO), which decomposes the generator into hierarchical blocks and incrementally refines intermediate representations to enhance the semantic fidelity of reconstructed images. To stabilize the optimization and improve image realism, we introduce an L1-ball constraint during reconstruction. Extensive experiments show that our method outperforms prior attacks by a large margin, especially in high-resolution scenarios, out-of-distribution settings, and against deeper and more complex DNNs.