Reinforcement Learning with Neural Radiance Fields
This work addresses representation learning for RL agents, particularly in robotics. Its contribution is incremental, since it combines existing NeRF and RL methods.
The paper tackles the problem of finding effective state representations for reinforcement learning by using Neural Radiance Fields (NeRFs) as supervision for learning a latent space. RL agents trained on this latent space perform better on robotic manipulation tasks such as hanging mugs on hooks, pushing objects, and opening doors.
It is a long-standing problem to find effective representations for training reinforcement learning (RL) agents. This paper demonstrates that learning state representations with supervision from Neural Radiance Fields (NeRFs) can improve the performance of RL compared to other learned representations or even low-dimensional, hand-engineered state information. Specifically, we propose to train an encoder that maps multiple image observations to a latent space describing the objects in the scene. The decoder built from a latent-conditioned NeRF serves as the supervision signal to learn the latent space. An RL algorithm then operates on the learned latent space as its state representation. We call this NeRF-RL. Our experiments indicate that NeRF as supervision leads to a latent space better suited for the downstream RL tasks involving robotic object manipulations like hanging mugs on hooks, pushing objects, or opening doors. Video: https://dannydriess.github.io/nerf-rl
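The pipeline described above (multi-view encoder, latent-conditioned NeRF decoder, reconstruction loss as supervision) can be illustrated with a minimal, untrained sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `encode`, `field`, `render_ray`, and `reconstruction_loss`, the latent size, the single linear layers with random weights, the averaging of per-view features, and the toy image size. Only the ray integration follows the standard NeRF volume-rendering quadrature. Training by gradient descent and the RL algorithm itself are omitted; in NeRF-RL the policy would take the latent as its state.

```python
import math
import random

LATENT_DIM = 4  # hypothetical latent size, chosen only for illustration


def make_weights(rows, cols, rng):
    """Random weight matrix standing in for trained network parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vec):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def encode(views, enc_weights):
    """Map several flattened images to one latent by averaging per-view features.

    Averaging is an assumed fusion rule; it makes the latent independent of view order.
    """
    per_view = [[math.tanh(f) for f in linear(enc_weights, img)] for img in views]
    n = len(per_view)
    return [sum(feat[k] for feat in per_view) / n for k in range(len(enc_weights))]


def field(point, latent, dec_weights):
    """Latent-conditioned radiance field: (3D point, latent) -> (density, rgb)."""
    out = linear(dec_weights, list(point) + list(latent))
    density = math.log1p(math.exp(out[0]))                # softplus keeps density >= 0
    rgb = [1.0 / (1.0 + math.exp(-o)) for o in out[1:4]]  # sigmoid keeps colour in (0, 1)
    return density, rgb


def render_ray(origin, direction, latent, dec_weights, near=0.0, far=2.0, n_samples=16):
    """Volume rendering: accumulate colour weighted by transmittance and opacity."""
    delta = (far - near) / n_samples
    transmittance = 1.0
    color = [0.0, 0.0, 0.0]
    for i in range(n_samples):
        t = near + (i + 0.5) * delta  # midpoint of the i-th interval along the ray
        point = [o + t * d for o, d in zip(origin, direction)]
        density, rgb = field(point, latent, dec_weights)
        alpha = 1.0 - math.exp(-density * delta)
        weight = transmittance * alpha
        color = [c + weight * ch for c, ch in zip(color, rgb)]
        transmittance *= 1.0 - alpha
    return color


def reconstruction_loss(rendered, target):
    """Mean squared error between a rendered pixel and the observed pixel."""
    return sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(target)


rng = random.Random(0)
enc_weights = make_weights(LATENT_DIM, 6, rng)      # 6 = pixels per toy image
dec_weights = make_weights(4, 3 + LATENT_DIM, rng)  # outputs: density + rgb
views = [[rng.random() for _ in range(6)] for _ in range(3)]  # three toy camera views
latent = encode(views, enc_weights)  # this vector would be the RL state
pixel = render_ray([0.0, 0.0, -1.0], [0.0, 0.0, 1.0], latent, dec_weights)
loss = reconstruction_loss(pixel, [0.5, 0.5, 0.5])
```

In a trained system, the gradient of `loss` with respect to both the encoder and decoder weights would shape the latent space. The sketch only shows how the pieces connect.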