AI · May 29

Closed-Loop Neural Activation Control in Vision-Language-Action Models

arXiv:2606.00269 · 23.7 · h-index: 15
Predicted impact: top 50% in AI · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses the problem of unstable test-time steering in embodied VLA models, which is important for reliable robot control.

CTRL-STEER introduces closed-loop control for steering Vision-Language-Action models, replacing fixed intervention coefficients with adaptive, time-varying control signals. On LIBERO tasks it achieves more stable concept regulation and a better trade-off between steering and task success, without retraining the base model.

Vision-Language-Action (VLA) models can be steered at test time by intervening on semantically meaningful internal directions, but existing methods use a fixed steering coefficient, effectively operating in open loop. This is poorly suited to embodied control, where task state and concept error evolve over time; a fixed coefficient often causes overcorrection, oscillation, and reduced task success, especially for temporal behaviors such as speed and smoothness. We propose CTRL-STEER, a closed-loop framework that replaces static intervention strength with adaptive, time-varying control signals. The key idea is to decouple representation from regulation: rather than assuming temporal concepts are directly controlled by individual neurons, we steer along motion-aligned residual directions while a feedback controller adjusts intervention magnitude online. We instantiate this framework with both PID and reinforcement-learning-based controllers. Experiments with a fine-tuned OpenVLA policy on four LIBERO task suites show that CTRL-STEER achieves more stable concept regulation and a better steering-task success trade-off than fixed-coefficient baselines, without modifying or retraining the base model.
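
The closed-loop idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the class `PIDSteeringController`, the helper `steer`, the gains, the clamping range, and the toy "plant" in `run_toy_loop` (where the measured concept value simply equals a baseline plus the steering coefficient) are all assumptions made for illustration. The sketch only shows the general pattern of a textbook PID loop choosing the coefficient that scales a fixed steering direction added to a hidden state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PIDSteeringController:
    """Hypothetical PID controller that outputs a steering coefficient alpha."""
    kp: float
    ki: float
    kd: float
    alpha_min: float = -5.0   # assumed clamp range, not from the paper
    alpha_max: float = 5.0
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, target: float, measured: float, dt: float = 1.0) -> float:
        """Return the clamped coefficient for the current control step."""
        error = target - measured
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no history on the first step
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        alpha = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(self.alpha_min, min(self.alpha_max, alpha))


def steer(hidden: List[float], direction: List[float], alpha: float) -> List[float]:
    """Add alpha * direction to a hidden (residual) vector, elementwise."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


def run_toy_loop(target: float, baseline: float, steps: int = 50) -> float:
    """Toy closed loop: the measured concept is assumed to be baseline + alpha.

    A real system would instead read the concept (e.g. speed) from the
    rollout and apply `steer` inside the model's forward pass.
    """
    controller = PIDSteeringController(kp=0.2, ki=0.5, kd=0.0)
    alpha = 0.0
    measured = baseline
    for _ in range(steps):
        measured = baseline + alpha            # assumed plant response
        alpha = controller.update(target, measured)
    return measured
```

In an open-loop baseline, `alpha` would be a constant chosen in advance; here it is recomputed at every step from the current concept error, which is the distinction the abstract draws between fixed-coefficient steering and feedback-regulated steering.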

