LG · AI · May 24

Riemannian-Manifold Steering: Geometry-Aware Generative Autoencoders for Label-Free Steering

arXiv:2605.24942 · 67.9
Predicted impact: top 27% in LG · last 90 days
Originality: Highly original
AI Analysis

For researchers and practitioners of language model control, this work provides a label-free, geometry-aware steering method that removes prior constraints of labelled centroids and prescribed topologies.

The paper introduces a Riemannian-geometry framework for steering language model activations, using a learned encoder to approximate the Hellinger distance without requiring labels or predefined structure. The method achieves reliable class steering across a standard four-task benchmark, following more natural trajectories than baselines.

Steering a language model - intervening on its internal activations to change downstream behaviour - has recently expanded beyond linear interpolation to nonlinear methods such as angular and kernelized steering, which define intervention transformations without learning an explicit geometry over paths in activation space. Freshly introduced geometry-aware manifold methods do learn such a geometry, but require labelled class centroids together with prescribed cyclic or sequential structure. These assumptions restrict where manifold steering can be applied, since existing constructions require labelled centroids and compatible boundary conditions. We recast manifold steering more broadly as Riemannian geodesic computation on activation space, recovering linear and labelled-spline steering as geodesics under particular choices of metric. A principled metric within this framework is the output-space Hellinger distance pulled back to activations; we approximate this with a learned encoder trained on output distances over a small concept-token schema - no per-prompt labels, no topology prior, and no per-task curve fitting. Empirically, the method reliably drives the model onto the target class across all tasks in a standard four-task language-model arithmetic benchmark, while following more behaviourally natural trajectories than baselines on smaller output spaces. We thereby provide a unified Riemannian framework for manifold steering together with a schema-supervised, label-free instantiation that operates without labelled centroids or prescribed boundary conditions.
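
To make the "output-space Hellinger distance pulled back to activations" concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the `readout` matrix is a hypothetical toy stand-in for the map from an activation to output logits, and `softmax`, `hellinger`, and `path_length` are names introduced here. It measures the length of a straight-line (linear steering) path between two activations by summing Hellinger distances between the output distributions at consecutive points.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def hellinger(p, q):
    """Hellinger distance H(p, q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2)."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2, axis=-1))


def path_length(path, readout):
    """Discrete length of an activation path, measured in output space.

    path:    (n_points, d) array of activations along the path
    readout: (d, n_classes) toy linear map to logits (an assumption of this sketch)
    """
    probs = softmax(path @ readout)
    return float(np.sum(hellinger(probs[:-1], probs[1:])))


rng = np.random.default_rng(0)
readout = rng.normal(size=(8, 5))      # hypothetical stand-in for the model's output head
a, b = rng.normal(size=8), rng.normal(size=8)

# Linear steering: straight-line interpolation between the two activations.
t = np.linspace(0.0, 1.0, 33)[:, None]
linear_path = (1.0 - t) * a + t * b

length = path_length(linear_path, readout)
endpoint_distance = hellinger(softmax(a @ readout), softmax(b @ readout))
print(f"path length: {length:.4f}, endpoint distance: {endpoint_distance:.4f}")
```

In this picture, a geodesic under the pulled-back metric is a path between the same endpoints that minimises this output-space length as the discretisation is refined; the straight line is generally not that path. The abstract's learned encoder, which approximates these distances without evaluating the output head along every candidate path, is not reproduced here.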

