AT, AI, CG, LG, DG · Jul 29, 2025

Exploring the Stratified Space Structure of an RL Game with the Volume Growth Transform

arXiv:2507.22010v1 · 2 citations · h-index: 18
Originality: Incremental advance
AI Analysis

This work provides a new geometric indicator of complexity for RL games, but it is incremental as it adapts an existing method to a specific domain.

The authors investigated the embedding space of a transformer-based PPO model in a visual coin-collecting RL game, finding it is a stratified space with varying local dimensions, and linked low dimensions to fixed sub-strategies and high dimensions to achieving sub-goals or increased environmental complexity.

In this work, we explore the structure of the embedding space of a transformer model trained for playing a particular reinforcement learning (RL) game. Specifically, we investigate how a transformer-based Proximal Policy Optimization (PPO) model embeds visual inputs in a simple environment where an agent must collect "coins" while avoiding dynamic obstacles consisting of "spotlights." By adapting Robinson et al.'s study of the volume growth transform for LLMs to the RL setting, we find that the token embedding space for our visual coin-collecting game is also not a manifold, and is better modeled as a stratified space, where local dimension can vary from point to point. We further strengthen Robinson et al.'s method by proving that fairly general volume growth curves can be realized by stratified spaces. Finally, we carry out an analysis that suggests that as an RL agent acts, its latent representation alternates between periods of low local dimension, while following a fixed sub-strategy, and bursts of high local dimension, where the agent achieves a sub-goal (e.g., collecting an object) or where the environmental complexity increases (e.g., more obstacles appear). Consequently, our work suggests that the distribution of dimensions in a stratified latent space may provide a new geometric indicator of complexity for RL games.
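To make the idea of "local dimension from volume growth" concrete, here is a minimal sketch, not the authors' implementation. It assumes the simplest version of the technique: count how many points of a cloud fall within radius r of a chosen center, and read the local dimension off the slope of log(count) against log(r). The function name `local_dimension`, the toy point cloud (a 2-D square with a 1-D segment attached), and the radius range are all hypothetical choices made for illustration.

```python
import numpy as np

def local_dimension(points, center, radii):
    """Estimate the local dimension at `center` as the slope of
    log(neighbor count) versus log(radius)."""
    dists = np.linalg.norm(points - center, axis=1)
    counts = np.array([(dists <= r).sum() for r in radii])
    mask = counts > 0  # avoid log(0) for radii that contain no points
    slope, _ = np.polyfit(np.log(radii[mask]), np.log(counts[mask]), 1)
    return slope

# Toy stratified space (an assumption for illustration, not the paper's data):
# a 2-D square in the xy-plane plus a 1-D segment along the z-axis.
rng = np.random.default_rng(0)
plane = np.column_stack([rng.uniform(-1, 1, (20000, 2)), np.zeros(20000)])
line = np.column_stack([np.zeros((3000, 2)), rng.uniform(0, 3, 3000)])
cloud = np.vstack([plane, line])

radii = np.geomspace(0.1, 0.4, 8)

# A center on the square sees area-like growth; a center on the segment
# sees length-like growth, so the estimates differ from point to point.
dim_plane = local_dimension(cloud, np.array([0.5, 0.5, 0.0]), radii)
dim_line = local_dimension(cloud, np.array([0.0, 0.0, 2.0]), radii)
print(f"estimate on the square:  {dim_plane:.2f}")
print(f"estimate on the segment: {dim_line:.2f}")
```

On this toy cloud the two estimates should land near 2 and near 1 respectively, which is the sense in which a stratified space has a local dimension that varies from point to point. Real embedding spaces are noisier, and the estimate depends on the radius range chosen.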
