SilentDrift: Exploiting Action Chunking for Stealthy Backdoor Attacks on Vision-Language-Action Models

Bingxin Xu, Yuzhang Shang, Binghui Wang, Emilio Ferrara

arXiv:2601.14323v13 citations

Originality Highly original

AI Analysis

This addresses a critical security flaw in safety-critical robotic applications, representing a novel attack method rather than an incremental improvement.

The paper tackled the security vulnerability in Vision-Language-Action models by exploiting action chunking and delta pose representations to create stealthy backdoor attacks, achieving a 93.2% Attack Success Rate with under 2% poisoning while maintaining 95.3% Clean Task Success Rate.

Vision-Language-Action (VLA) models are increasingly deployed in safety-critical robotic applications, yet their security vulnerabilities remain underexplored. We identify a fundamental security flaw in modern VLA systems: the combination of action chunking and delta pose representations creates an intra-chunk visual open-loop. This mechanism forces the robot to execute K-step action sequences, allowing per-step perturbations to accumulate through integration. We propose SILENTDRIFT, a stealthy black-box backdoor attack exploiting this vulnerability. Our method employs the Smootherstep function to construct perturbations with guaranteed C2 continuity, ensuring zero velocity and acceleration at trajectory boundaries to satisfy strict kinematic consistency constraints. Furthermore, our keyframe attack strategy selectively poisons only the critical approach phase, maximizing impact while minimizing trigger exposure. The resulting poisoned trajectories are visually indistinguishable from successful demonstrations. Evaluated on the LIBERO, SILENTDRIFT achieves a 93.2% Attack Success Rate with a poisoning rate under 2%, while maintaining a 95.3% Clean Task Success Rate.

View on arXiv PDF

Similar