CR · LG · May 26

Poison with Style: A Practical Poisoning Attack on Code Large Language Models

arXiv:2605.27631 · 81.2 · h-index: 19 · Has Code
Predicted impact: top 14% in CR · last 90 days · Originality: Highly original
AI Analysis

For developers and organizations using Code LLMs, this attack demonstrates a practical and stealthy threat that evades current defenses, highlighting the need for robust security measures.

PwS introduces a stealthy poisoning attack on Code LLMs that uses developers' code styles as covert triggers, achieving a 95% attack success rate for CWE-20 vulnerabilities with less than a 5% drop in pass@1 on HumanEval and MBPP.

Code Large Language Models (CLLMs) serve as the core of modern code agents, enabling developers to automate complex software development tasks. In this paper, we present Poison-with-Style (PwS), a practical and stealthy model poisoning attack targeting CLLMs. Unlike prior attacks that assume an active adversary capable of directly embedding explicit triggers (e.g., specific words) into developers' prompts during inference, PwS leverages developers' code styles as covert triggers implicitly embedded within their prompts. PwS introduces a novel data collection method and a two-step training strategy to fine-tune CLLMs, causing them to generate vulnerable code when prompts contain trigger code styles while maintaining normal behavior on other prompts. Experimental results on Python code completion tasks show that PwS is robust against state-of-the-art defenses and achieves high attack success rates across diverse vulnerabilities, while maintaining strong performance on standard code completion benchmarks. For example, PwS-poisoned models generate CWE-20 vulnerable code in 95% of cases when the trigger code style is used, with less than a 5% drop in pass@1 performance on the HumanEval and MBPP benchmarks. Our implementation and dataset are here: https://github.com/khangtran2020/pws.
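The pass@1 figure quoted above is a standard functional-correctness metric for code generation. As a minimal sketch of how such a number is commonly computed, the snippet below uses the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples drawn per problem and c the number that pass the tests. The per-problem counts and the names `pass_at_k`, `mean_pass_at_k`, `clean`, and `poisoned` are hypothetical and invented for illustration. This listing does not state the paper's exact evaluation protocol, nor whether the reported drop is absolute or relative, so the absolute difference shown here is an assumption.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random
    # size-k subset of the n samples contains no correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over per-problem (n_samples, n_correct) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical counts, NOT results from the paper.
clean = [(10, 7), (10, 4), (10, 10)]
poisoned = [(10, 6), (10, 4), (10, 10)]

# Assumption: the drop is measured as an absolute difference in mean pass@1.
drop = mean_pass_at_k(clean, 1) - mean_pass_at_k(poisoned, 1)
print(f"absolute pass@1 drop: {drop:.3f}")
```

For k = 1 the estimator reduces to c / n, so pass@1 is simply the fraction of samples that pass, averaged over problems.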
