LG | AI | May 16

The Laplacian Keyboard: Beyond the Linear Span

Siddarth Chandrasekar, Marlos C. Machado

arXiv:2602.07730 | 55.5 | h-index: 3

AI Analysis

This work addresses the expressivity limitation of Laplacian-based RL methods for control tasks, offering a theoretically grounded improvement for zero-shot and sample-efficient learning.

The Laplacian Keyboard (LK) is a hierarchical framework that extends beyond the linear span of Laplacian eigenvectors to enable more expressive policies in reinforcement learning. It improves on the zero-shot solution while achieving better sample efficiency than standard RL methods.

Across scientific disciplines, Laplacian eigenvectors serve as a fundamental basis for simplifying complex systems, from signal processing to quantum mechanics. In reinforcement learning (RL), they similarly form a basis over the state space, enabling reward functions to be approximated by projection onto a small set of eigenvectors. This projection makes zero-shot control possible, but it also imposes a fundamental limitation: the induced policies are only as expressive as the linear span of the chosen eigenvectors. We introduce the Laplacian Keyboard (LK), a hierarchical framework that goes beyond this linear span. LK constructs a task-agnostic library of behaviors from these eigenvectors, forming a behavior basis guaranteed to contain the optimal policy for any reward within the linear span. A meta-policy learns to stitch these behaviors dynamically, enabling efficient learning of policies outside the original linear constraints. We establish theoretical bounds on zero-shot approximation error and demonstrate empirically that LK improves over the zero-shot solution while achieving better sample efficiency compared to standard RL methods.
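
The projection step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the 20-state chain environment, the combinatorial graph Laplacian, and the helper names `chain_laplacian` and `project_reward` are all assumptions chosen for illustration. It shows only that a sparse goal reward is poorly captured by a few low-frequency eigenvectors, which is the linear-span limitation LK is meant to address.

```python
import numpy as np

def chain_laplacian(n):
    """Combinatorial Laplacian L = D - A of an n-state path graph (assumed toy environment)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def project_reward(reward, eigvecs, k):
    """Project a reward vector onto the span of the first k eigenvectors (hypothetical helper)."""
    basis = eigvecs[:, :k]      # columns are orthonormal, so projection is basis @ basis.T @ r
    coeffs = basis.T @ reward   # task coefficients in the eigenvector basis
    return basis @ coeffs, coeffs

n = 20
L = chain_laplacian(n)
# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(L)

# Sparse goal reward: 1 at the last state, 0 elsewhere.
reward = np.zeros(n)
reward[-1] = 1.0

approx_small, _ = project_reward(reward, eigvecs, 4)  # small basis: lossy
approx_full, _ = project_reward(reward, eigvecs, n)   # full basis: exact

err_small = np.linalg.norm(reward - approx_small)
err_full = np.linalg.norm(reward - approx_full)
print(f"residual with 4 eigenvectors: {err_small:.3f}")
print(f"residual with all {n} eigenvectors: {err_full:.2e}")
```

With only four eigenvectors, most of the reward's norm lies outside the span, so any policy derived from the projected reward alone is optimizing a blurred version of the task. The full basis recovers the reward exactly, but at the cost of using every eigenvector.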
