Chengjun Cai

CR
h-index34
8papers
107citations
Novelty53%
AI Score47

8 Papers

LGJul 21, 2024
Arondight: Red Teaming Large Vision Language Models with Auto-generated Multi-modal Jailbreak Prompts

Yi Liu, Chengjun Cai, Xiaoli Zhang et al.

Large Vision Language Models (VLMs) extend and enhance the perceptual abilities of Large Language Models (LLMs). Despite offering new possibilities for LLM applications, these advancements raise significant security and ethical concerns, particularly regarding the generation of harmful content. While LLMs have undergone extensive security evaluations with the aid of red teaming frameworks, VLMs currently lack a well-developed one. To fill this gap, we introduce Arondight, a standardized red team framework tailored specifically for VLMs. Arondight is dedicated to resolving issues related to the absence of visual modality and inadequate diversity encountered when transitioning existing red teaming methodologies from LLMs to VLMs. Our framework features an automated multi-modal jailbreak attack, wherein visual jailbreak prompts are produced by a red team VLM, and textual prompts are generated by a red team LLM guided by a reinforcement learning agent. To enhance the comprehensiveness of VLM security evaluation, we integrate entropy bonuses and novelty reward metrics. These elements incentivize the RL agent to guide the red team LLM in creating a wider array of diverse and previously unseen test cases. Our evaluation of ten cutting-edge VLMs exposes significant security vulnerabilities, particularly in generating toxic images and aligning multi-modal prompts. In particular, our Arondight achieves an average attack success rate of 84.5\% on GPT-4 in all fourteen prohibited scenarios defined by OpenAI in terms of generating toxic text. For a clearer comparison, we also categorize existing VLMs based on their safety levels and provide corresponding reinforcement recommendations. Our multimodal prompt dataset and red team code will be released after ethics committee approval. CONTENT WARNING: THIS PAPER CONTAINS HARMFUL MODEL RESPONSES.

CRDec 9, 2025
PrivTune: Efficient and Privacy-Preserving Fine-Tuning of Large Language Models via Device-Cloud Collaboration

Yi Liu, Weixiang Han, Chengjun Cai et al.

With the rise of large language models, service providers offer language models as a service, enabling users to fine-tune customized models via uploaded private datasets. However, this raises concerns about sensitive data leakage. Prior methods, relying on differential privacy within device-cloud collaboration frameworks, struggle to balance privacy and utility, exposing users to inference attacks or degrading fine-tuning performance. To address this, we propose PrivTune, an efficient and privacy-preserving fine-tuning framework via Split Learning (SL). The key idea of PrivTune is to inject crafted noise into token representations from the SL bottom model, making each token resemble the $n$-hop indirect neighbors. PrivTune formulates this as an optimization problem to compute the optimal noise vector, aligning with defense-utility goals. On this basis, it then adjusts the parameters (i.e., mean) of the $d_χ$-Privacy noise distribution to align with the optimization direction and scales the noise according to token importance to minimize distortion. Experiments on five datasets (covering both classification and generation tasks) against three embedding inversion and three attribute inference attacks show that, using RoBERTa on the Stanford Sentiment Treebank dataset, PrivTune reduces the attack success rate to 10% with only a 3.33% drop in utility performance, outperforming state-of-the-art baselines.

CRSep 18, 2024
Training with Differential Privacy: A Gradient-Preserving Noise Reduction Approach with Provable Security

Haodi Wang, Tangyu Jiang, Yu Guo et al.

Deep learning models have been extensively adopted in various regions due to their ability to represent hierarchical features, which highly rely on the training set and procedures. Thus, protecting the training process and deep learning algorithms is paramount in privacy preservation. Although Differential Privacy (DP) as a powerful cryptographic primitive has achieved satisfying results in deep learning training, the existing schemes still fall short in preserving model utility, i.e., they either invoke a high noise scale or inevitably harm the original gradients. To address the above issues, in this paper, we present a more robust and provably secure approach for differentially private training called GReDP. Specifically, we compute the model gradients in the frequency domain and adopt a new approach to reduce the noise level. Unlike previous work, our GReDP only requires half of the noise scale compared to DPSGD [1] while keeping all the gradient information intact. We present a detailed analysis of our method both theoretically and empirically. The experimental results show that our GReDP works consistently better than the baselines on all models and training settings.

LGMar 31
Big2Small: A Unifying Neural Network Framework for Model Compression

Jing-Xiao Liao, Haoran Wang, Tao Li et al.

With the development of foundational models, model compression has become a critical requirement. Various model compression approaches have been proposed such as low-rank decomposition, pruning, quantization, ergodic dynamic systems, and knowledge distillation, which are based on different heuristics. To elevate the field from fragmentation to a principled discipline, we construct a unifying mathematical framework for model compression grounded in measure theory. We further demonstrate that each model compression technique is mathematically equivalent to a neural network subject to a regularization. Building upon this mathematical and structural equivalence, we propose an experimentally-verified data-free model compression framework, termed \textit{Big2Small}, which translates Implicit Neural Representations (INRs) from data domain to the domain of network parameters. \textit{Big2Small} trains compact INRs to encode the weights of larger models and reconstruct the weights during inference. To enhance reconstruction fidelity, we introduce Outlier-Aware Preprocessing to handle extreme weight values and a Frequency-Aware Loss function to preserve high-frequency details. Experiments on image classification and segmentation demonstrate that \textit{Big2Small} achieves competitive accuracy and compression ratios compared to state-of-the-art baselines.

CROct 20, 2025
ParaVul: A Parallel Large Language Model and Retrieval-Augmented Framework for Smart Contract Vulnerability Detection

Tenghui Huang, Jinbo Wen, Jiawen Kang et al.

Smart contracts play a significant role in automating blockchain services. Nevertheless, vulnerabilities in smart contracts pose serious threats to blockchain security. Currently, traditional detection methods primarily rely on static analysis and formal verification, which can result in high false-positive rates and poor scalability. Large Language Models (LLMs) have recently made significant progress in smart contract vulnerability detection. However, they still face challenges such as high inference costs and substantial computational overhead. In this paper, we propose ParaVul, a parallel LLM and retrieval-augmented framework to improve the reliability and accuracy of smart contract vulnerability detection. Specifically, we first develop Sparse Low-Rank Adaptation (SLoRA) for LLM fine-tuning. SLoRA introduces sparsification by incorporating a sparse matrix into quantized LoRA-based LLMs, thereby reducing computational overhead and resource requirements while enhancing their ability to understand vulnerability-related issues. We then construct a vulnerability contract dataset and develop a hybrid Retrieval-Augmented Generation (RAG) system that integrates dense retrieval with Best Matching 25 (BM25), assisting in verifying the results generated by the LLM. Furthermore, we propose a meta-learning model to fuse the outputs of the RAG system and the LLM, thereby generating the final detection results. After completing vulnerability detection, we design chain-of-thought prompts to guide LLMs to generate comprehensive vulnerability detection reports. Simulation results demonstrate the superiority of ParaVul, especially in terms of F1 scores, achieving 0.9398 for single-label detection and 0.9330 for multi-label detection.

CRDec 19, 2020
FedServing: A Federated Prediction Serving Framework Based on Incentive Mechanism

Jiasi Weng, Jian Weng, Hongwei Huang et al.

Data holders, such as mobile apps, hospitals and banks, are capable of training machine learning (ML) models and enjoy many intelligence services. To benefit more individuals lacking data and models, a convenient approach is needed which enables the trained models from various sources for prediction serving, but it has yet to truly take off considering three issues: (i) incentivizing prediction truthfulness; (ii) boosting prediction accuracy; (iii) protecting model privacy. We design FedServing, a federated prediction serving framework, achieving the three issues. First, we customize an incentive mechanism based on Bayesian game theory which ensures that joining providers at a Bayesian Nash Equilibrium will provide truthful (not meaningless) predictions. Second, working jointly with the incentive mechanism, we employ truth discovery algorithms to aggregate truthful but possibly inaccurate predictions for boosting prediction accuracy. Third, providers can locally deploy their models and their predictions are securely aggregated inside TEEs. Attractively, our design supports popular prediction formats, including top-1 label, ranked labels and posterior probability. Besides, blockchain is employed as a complementary component to enforce exchange fairness. By conducting extensive experiments, we validate the expected properties of our design. We also empirically demonstrate that FedServing reduces the risk of certain membership inference attack.

CRNov 12, 2020
Golden Grain: Building a Secure and Decentralized Model Marketplace for MLaaS

Jiasi Weng, Jian Weng, Chengjun Cai et al.

ML-as-a-service (MLaaS) becomes increasingly popular and revolutionizes the lives of people. A natural requirement for MLaaS is, however, to provide highly accurate prediction services. To achieve this, current MLaaS systems integrate and combine multiple well-trained models in their services. Yet, in reality, there is no easy way for MLaaS providers, especially for startups, to collect sufficiently well-trained models from individual developers, due to the lack of incentives. In this paper, we aim to fill this gap by building up a model marketplace, called as Golden Grain, to facilitate model sharing, which enforces the fair model-money swapping process between individual developers and MLaaS providers. Specifically, we deploy the swapping process on the blockchain, and further introduce a blockchain-empowered model benchmarking process for transparently determining the model prices according to their authentic performances, so as to motivate the faithful contributions of well-trained models. Especially, to ease the blockchain overhead for model benchmarking, our marketplace carefully offloads the heavy computation and designs a secure off-chain on-chain interaction protocol based on a trusted execution environment (TEE), for ensuring both the integrity and authenticity of benchmarking. We implement a prototype of our Golden Grain on the Ethereum blockchain, and conduct extensive experiments using standard benchmark datasets to demonstrate the practically affordable performance of our design.

CRSep 20, 2019
Augmenting Encrypted Search: A Decentralized Service Realization with Enforced Execution

Shengshan Hu, Chengjun Cai, Qian Wang et al.

Searchable symmetric encryption (SSE) allows the data owner to outsource an encrypted database to a remote server in a private manner while maintaining the ability for selectively search. So far, most existing solutions focus on an honest-but-curious server, while security designs against a malicious server have not drawn enough attention. A few recent works have attempted to construct verifiable SSE that enables the data owner to verify the integrity of search results. Nevertheless, these verification mechanisms are highly dependent on specific SSE schemes, and fail to support complex queries. A general verification mechanism is desired that can be applied to all SSE schemes. In this work, instead of concentrating on a central server, we explore the potential of the smart contract, an emerging blockchain-based decentralized technology, and construct decentralized SSE schemes where the data owner can receive correct search results with assurance without worrying about potential wrongdoings of a malicious server. We study both public and private blockchain environments and propose two designs with a trade-off between security and efficiency. To better support practical applications, the multi-user setting of SSE is further investigated where the data owner allows authenticated users to search keywords in shared documents. We implement prototypes of our two designs and present experiments and evaluations to demonstrate the practicability of our decentralized SSE schemes.