LG | AI | CL | CV | Aug 18, 2025

Learning to Steer: Input-dependent Steering for Multimodal LLMs

arXiv:2508.12815v2 | 8 citations | h-index: 17 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for fine-grained control in multimodal LLMs to handle context-dependent behaviors, representing an incremental improvement over existing steering techniques.

The paper tackles input-dependent steering for multimodal LLMs, where the behavior to enforce (for example, safety) depends on the query. It proposes training an auxiliary module to predict input-specific steering vectors, which reduces hallucinations and improves safety compared to static baselines.

Steering has emerged as a practical approach to enable post-hoc guidance of LLMs towards enforcing a specific behavior. However, it remains largely underexplored for multimodal LLMs (MLLMs); furthermore, existing steering techniques, such as mean steering, rely on a single steering vector, applied independently of the input query. This paradigm faces limitations when the desired behavior depends on the example at hand. For example, a safe answer may consist of abstaining from answering when asked about an illegal activity, or of pointing to external resources or consultation with an expert when asked for medical advice. In this paper, we investigate fine-grained steering that uses an input-specific linear shift. This shift is computed using contrastive input-specific prompting. However, the input-specific prompts required for this approach are not known at test time. Therefore, we propose to train a small auxiliary module to predict the input-specific steering vector. Our approach, dubbed L2S (Learn-to-Steer), reduces hallucinations and enforces safety in MLLMs, outperforming static baselines. Our code is publicly available at https://jayneelparekh.github.io/learn-to-steer/
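
The contrast between a single mean steering vector and a predicted input-specific shift can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the hidden states are synthetic NumPy arrays rather than activations from an MLLM, the auxiliary module is a ridge-regularized linear map (the abstract only says "a small auxiliary module"), and the names `fit_linear_predictor`, `steer`, and `alpha` are hypothetical. It is not the released L2S implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 16  # toy values: number of inputs, hidden size

# Synthetic stand-ins for hidden states (assumption: a real setup would
# read these from an intermediate layer of the MLLM).
h_input = rng.normal(size=(n, d))              # input alone
W_true = rng.normal(size=(d, d)) / np.sqrt(d)  # hidden input-dependent effect
h_pos = h_input + h_input @ W_true             # input + prompt for desired behavior
h_neg = h_input.copy()                         # input + contrastive prompt

# Input-specific steering vectors from contrastive prompting.
z = h_pos - h_neg                              # shape (n, d)

# Split: the contrastive prompts are only available for training inputs.
h_train, z_train = h_input[:150], z[:150]
h_test, z_test = h_input[150:], z[150:]

# Static baseline: one mean steering vector applied to every input.
z_mean = z_train.mean(axis=0)

def fit_linear_predictor(H, Z, ridge=1e-3):
    """Ridge regression: W minimizing ||H W - Z||^2 + ridge * ||W||^2."""
    dim = H.shape[1]
    return np.linalg.solve(H.T @ H + ridge * np.eye(dim), H.T @ Z)

def steer(h, W, alpha=1.0):
    """Apply the predicted input-specific linear shift to hidden states."""
    return h + alpha * (h @ W)

# Hypothetical auxiliary module: predicts the steering vector from the
# input representation, so no contrastive prompt is needed at test time.
W_hat = fit_linear_predictor(h_train, z_train)

err_learned = np.mean(np.sum((h_test @ W_hat - z_test) ** 2, axis=1))
err_mean = np.mean(np.sum((z_mean - z_test) ** 2, axis=1))
h_steered = steer(h_test, W_hat)
```

In this toy setting the target shift varies with the input by construction, so the learned predictor recovers held-out steering vectors more closely than the mean vector does. That says nothing about real models; it only shows why a single static vector can be a poor fit when the desired shift depends on the example.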

