CV · AI · Apr 5

VLA-Forget: Vision-Language-Action Unlearning for Embodied Foundation Models

arXiv:2604.03956 · 65.92 citations

Predicted impact: top 59% in CV · last 90 days
Originality: Incremental advance

AI Analysis

The paper addresses a new unlearning challenge for embodied foundation models. The advance is rated incremental because it builds on existing VLA architectures.

The paper tackles the removal of unsafe or sensitive behaviors from vision-language-action (VLA) models for robotics. It proposes VLA-Forget, which, relative to strong unlearning baselines, improves forgetting efficacy by 10%, preserves perceptual specificity by 22%, retains reasoning and task success by 9%, and reduces post-quantization recovery by 55%.

Vision-language-action (VLA) models are emerging as embodied foundation models for robotic manipulation, but their deployment introduces a new unlearning challenge: removing unsafe, spurious, or privacy-sensitive behaviors without degrading perception, language grounding, and action control. In OpenVLA-style policies, behavior is produced through a fused visual encoder, a cross-modal projector, and a language backbone that predicts tokenized robot actions, so undesirable knowledge can be distributed across perception, alignment, and reasoning/action layers rather than confined to a single module. Consequently, partial unlearning applied only to the vision stack or only to the language backbone is often insufficient, while conventional unlearning baselines designed for standalone vision or language models may leave residual forgetting or incur unnecessary utility loss in embodied settings. We propose VLA-Forget, a hybrid unlearning framework that combines ratio-aware selective editing for perception and cross-modal specificity with layer-selective reasoning/action unlearning for utility-preserving forgetting. VLA-Forget jointly optimizes three objectives: targeted forgetting, perceptual preservation, and reasoning retention, through staged updates over the visual encoder, projector, and upper action-generating transformer blocks. Across forget-set behavior probes and retain-task evaluations, VLA-Forget improves forgetting efficacy by 10%, preserves perceptual specificity by 22%, retains reasoning and task success by 9%, and reduces post-quantization recovery by 55% relative to strong unlearning baselines.
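The abstract describes a joint objective with three terms and staged updates over different module groups, but gives no formulas. The sketch below is one way such a setup could be organized, not the authors' implementation. The group names, the two-stage split, the equal default weights, and the weighted-sum form of the objective are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical stage layout: perception-side modules first, then the upper
# action-generating blocks. The names mirror the components the abstract
# lists; the exact grouping and ordering are assumptions.
STAGES = [
    ("perception", ("visual_encoder", "projector")),
    ("reasoning_action", ("upper_transformer_blocks",)),
]


@dataclass(frozen=True)
class LossWeights:
    # Assumed weights; the abstract does not state how the terms are balanced.
    forget: float = 1.0
    perceptual: float = 1.0
    reasoning: float = 1.0


def combined_objective(forget_loss, perceptual_loss, reasoning_loss,
                       weights=LossWeights()):
    """Weighted sum of the three objectives named in the abstract."""
    return (weights.forget * forget_loss
            + weights.perceptual * perceptual_loss
            + weights.reasoning * reasoning_loss)


def staged_schedule(all_groups):
    """Yield (stage_name, {group: is_trainable}) for each stage in order.

    Groups not listed for a stage are treated as frozen during that stage.
    """
    for stage_name, active in STAGES:
        yield stage_name, {group: group in active for group in all_groups}


if __name__ == "__main__":
    groups = ["visual_encoder", "projector",
              "lower_transformer_blocks", "upper_transformer_blocks"]
    for name, trainable in staged_schedule(groups):
        print(name, trainable)
```

In a real training loop, the per-stage flags would decide which parameters receive gradient updates, and the three loss values would come from forget-set and retain-set batches.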
