CV, AI, LG | Mar 18, 2024

Reinforcement Learning with Generalizable Gaussian Splatting

arXiv:2404.07950v3 | 15 citations | h-index: 11 | IROS
Originality: Incremental advance
AI Analysis

This work addresses the poor generalization and interpretability of vision-based RL representations, though it is an incremental step that adapts an existing 3DGS method to RL.

The paper aims to improve environment representation in vision-based reinforcement learning by proposing a Generalizable Gaussian Splatting framework (GSRL), which improves performance by 10%, 44%, and 15% over the baselines on the hardest task in the RoboMimic environment.

An excellent representation is crucial for reinforcement learning (RL) performance, especially in vision-based RL tasks, where the quality of the environment representation directly influences how well the task is learned. Previous vision-based RL methods typically represent environments explicitly or implicitly, using images, points, voxels, or neural radiance fields. However, these representations have several drawbacks: they cannot describe complex local geometries, do not generalize well to unseen scenes, or require precise foreground masks. Moreover, implicit neural representations are akin to a "black box", which significantly hinders interpretability. 3D Gaussian Splatting (3DGS), with its explicit scene representation and differentiable rendering, is considered a revolutionary change for reconstruction and representation methods. In this paper, we propose a novel Generalizable Gaussian Splatting framework, called GSRL, as the representation for RL tasks. Validated in the RoboMimic environment, our method achieves better results than the baselines on multiple tasks, improving performance by 10%, 44%, and 15% compared with the baselines on the hardest task. This work is the first attempt to leverage generalizable 3DGS as a representation for RL.
