Hereditary Geometric Meta-RL: Nonlocal Generalization via Task Symmetries

Paul Nitschke, Shahriar Talebi

arXiv:2603.00396v1

Originality: Highly original

AI Analysis

This paper addresses nonlocal generalization in Meta-RL for robotics and control applications, and it proposes a novel method rather than an incremental improvement.

The paper tackles limited generalization in Meta-Reinforcement Learning by introducing a geometric approach based on task symmetries, which lets agents generalize across the entire task space. On a two-dimensional navigation task, the method recovers the ground-truth symmetry, while a baseline generalizes only near the training tasks.

Meta-Reinforcement Learning (Meta-RL) commonly generalizes via smoothness in the task encoding. While this enables local generalization around each training task, it requires dense coverage of the task space and leaves richer task space structure untapped. In response, we develop a geometric perspective that endows the task space with a "hereditary geometry" induced by the inherent symmetries of the underlying system. Concretely, the agent reuses a policy learned at training time by transforming states and actions through the action of a Lie group. This converts Meta-RL into symmetry discovery rather than smooth extrapolation, enabling the agent to generalize to wider regions of the task space. We show that when the task space is inherited from the symmetries of the underlying system, the task space embeds into a subgroup of those symmetries whose actions are linearizable, connected, and compact. These properties enable efficient learning and inference at test time. To learn these structures, we develop a differential symmetry discovery method. This collapses functional invariance constraints and thereby improves numerical stability and sample efficiency over functional approaches. Empirically, on a two-dimensional navigation task, our method efficiently recovers the ground-truth symmetry and generalizes across the entire task space, while a common baseline generalizes only near training tasks.
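The policy-reuse idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration and is not taken from the paper: the symmetry group is fixed to planar rotations SO(2) (the paper discovers the symmetry rather than assuming it), tasks are goal positions obtained by rotating a hypothetical training goal `BASE_GOAL`, and `base_policy` is a hand-written stand-in for a learned policy. The sketch shows only how a group element transports a policy from the training task to a rotated task.

```python
import numpy as np

# Hypothetical training task: reach this goal in the plane.
BASE_GOAL = np.array([1.0, 0.0])


def rotation(theta):
    """Return the 2x2 rotation matrix for angle theta (an element of SO(2))."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def base_policy(state):
    """Stand-in for a learned policy: a unit step from state toward BASE_GOAL."""
    diff = BASE_GOAL - state
    norm = np.linalg.norm(diff)
    return diff / norm if norm > 1e-12 else np.zeros(2)


def transported_policy(state, theta):
    """Reuse base_policy on the task whose goal is rotation(theta) @ BASE_GOAL.

    The state is mapped back into the training task's frame with the inverse
    group element (the transpose, for a rotation), the base policy is queried
    there, and the resulting action is mapped forward with the group element.
    """
    g = rotation(theta)
    return g @ base_policy(g.T @ state)


if __name__ == "__main__":
    theta = 2.0  # a task far from the training task
    state = np.array([0.3, -0.7])
    new_goal = rotation(theta) @ BASE_GOAL
    action = transported_policy(state, theta)
    # The transported action should point from the state toward the new goal.
    direct = (new_goal - state) / np.linalg.norm(new_goal - state)
    print(np.allclose(action, direct))
```

Because rotations preserve distances, the transported action coincides with the unit step toward the rotated goal for any angle, not only for angles close to the training task. This is the sense in which a known symmetry gives nonlocal generalization in this toy setting.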

