LG · AI · RO · Oct 15, 2025

Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents

MILA
arXiv:2510.13704v1 · 9 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses sample efficiency for reinforcement learning practitioners. It is an incremental advance: it builds on existing actor-critic methods and adds a novel geometric inductive bias.

The paper tackles the poor sample efficiency of actor-critic reinforcement learning agents by proposing simplicial embeddings, which improve sample efficiency and final performance across multiple environments without any loss in runtime speed.

Recent works have proposed accelerating the wall-clock training time of actor-critic methods via large-scale environment parallelization; unfortunately, these methods can still require a large number of environment interactions to reach a desired level of performance. Noting that well-structured representations can improve the generalization and sample efficiency of deep reinforcement learning (RL) agents, we propose the use of simplicial embeddings: lightweight representation layers that constrain embeddings to simplicial structures. This geometric inductive bias results in sparse and discrete features that stabilize critic bootstrapping and strengthen policy gradients. When applied to FastTD3, FastSAC, and PPO, simplicial embeddings consistently improve sample efficiency and final performance across a variety of continuous- and discrete-control environments, without any loss in runtime speed.
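
As a rough illustration of what "constraining embeddings to simplicial structures" could look like, here is a minimal NumPy sketch. It assumes the layer splits a feature vector into equal-sized groups and applies a temperature-scaled softmax within each group, so that every group lies on a probability simplex. The function name `simplicial_embedding` and the parameters `num_groups` and `temperature` are illustrative assumptions, not names taken from the paper or its code.

```python
import numpy as np


def simplicial_embedding(z, num_groups, temperature=1.0):
    """Project features of shape (batch, dim) onto a product of simplices.

    Assumption: a group-wise softmax is used; the abstract does not
    spell out the exact construction.
    """
    batch, dim = z.shape
    if dim % num_groups != 0:
        raise ValueError("feature dim must be divisible by num_groups")

    # Split each feature vector into `num_groups` groups and scale by temperature.
    grouped = z.reshape(batch, num_groups, dim // num_groups) / temperature

    # Subtract the per-group max for numerical stability before exponentiating.
    grouped = grouped - grouped.max(axis=-1, keepdims=True)
    exp = np.exp(grouped)

    # Normalize so each group is non-negative and sums to 1.
    probs = exp / exp.sum(axis=-1, keepdims=True)

    # Concatenate the groups back into a flat embedding.
    return probs.reshape(batch, dim)


# Hypothetical usage: 4 feature vectors of width 32, split into 8 groups of 4.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 32))
embedded = simplicial_embedding(features, num_groups=8, temperature=0.5)
```

In this sketch, a lower `temperature` pushes each group toward a one-hot vector, which is one way to read the "sparse and discrete features" described in the abstract. Because the layer is only a reshape and a softmax, it adds little compute, in line with the stated lack of runtime cost.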
