LG | AI | CL | CR | May 19, 2025

Does Low Rank Adaptation Lead to Lower Robustness against Training-Time Attacks?

arXiv:2505.12871v1 | 7 citations | h-index: 18 | ICML
Originality: Incremental advance
AI Analysis

The paper addresses security risks for users of efficient fine-tuning techniques, though the advance is incremental because it builds on existing LoRA research.

The paper investigates the security implications of low rank adaptation (LoRA) for fine-tuning large language models, finding that LoRA exhibits better robustness to backdoor attacks but greater vulnerability to untargeted data poisoning compared to full fine-tuning.

Low rank adaptation (LoRA) has emerged as a prominent technique for fine-tuning large language models (LLMs) thanks to its superb efficiency gains over previous methods. While extensive studies have examined the performance and structural properties of LoRA, its behavior under training-time attacks remains underexplored, posing significant security risks. In this paper, we theoretically investigate the security implications of LoRA's low-rank structure during fine-tuning, in the context of its robustness against data poisoning and backdoor attacks. We propose an analytical framework that models LoRA's training dynamics, employs the neural tangent kernel to simplify the analysis of the training process, and applies information theory to establish connections between LoRA's low-rank structure and its vulnerability to training-time attacks. Our analysis indicates that LoRA exhibits better robustness to backdoor attacks than full fine-tuning, while it becomes more vulnerable to untargeted data poisoning due to its over-simplified information geometry. Extensive experimental evaluations have corroborated our theoretical findings.
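
For readers unfamiliar with the low-rank structure the abstract refers to, the following is a minimal sketch of the standard LoRA parameterisation, in which a frozen weight matrix receives an update (alpha / r) * B A with B zero-initialised. It is not the paper's analytical framework or code; the dimensions, the scaling value `alpha`, and the helper names `matmul`, `lora_delta`, and `param_counts` are assumptions chosen for illustration.

```python
import random


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_delta(b, a, alpha, r):
    """Return the LoRA weight update (alpha / r) * B @ A."""
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(b, a)]


def param_counts(d_out, d_in, r):
    """Trainable parameters: full fine-tuning vs. a rank-r LoRA adapter."""
    return d_out * d_in, r * (d_out + d_in)


# Illustrative sizes only (assumed, not taken from the paper).
d_out, d_in, r, alpha = 8, 6, 2, 4

rng = random.Random(0)
A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]  # r x d_in, random init
B = [[0.0] * r for _ in range(d_out)]                                # d_out x r, zero init

delta = lora_delta(B, A, alpha, r)  # d_out x d_in update, all zeros before training
full, lora = param_counts(d_out, d_in, r)
print(full, lora)  # prints: 48 28
```

Because only A and B are trained, every update to the weights is confined to matrices of rank at most r. This is the restricted structure whose effect on backdoor and poisoning robustness the abstract describes.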

Code Implementations: 1 repo
