PixelBrax: Learning Continuous Control from Pixels End-to-End on the GPU
PixelBrax gives reinforcement learning researchers a faster, reproducible benchmark, though the contribution is incremental because it builds on existing physics engines and rendering techniques.
The authors address slow reinforcement learning experiments with pixel observations by introducing PixelBrax, a set of continuous control tasks that run end-to-end on the GPU and can run two orders of magnitude faster than existing benchmarks that rely on CPU-based rendering.
We present PixelBrax, a set of continuous control tasks with pixel observations. We combine the Brax physics engine with a pure JAX renderer, allowing reinforcement learning (RL) experiments to run end-to-end on the GPU. PixelBrax can render observations over thousands of parallel environments and can run two orders of magnitude faster than existing benchmarks that rely on CPU-based rendering. Additionally, PixelBrax supports fully reproducible experiments through its explicit handling of any stochasticity within the environments and supports color and video distractors for benchmarking generalization. We open-source PixelBrax alongside JAX implementations of several RL algorithms at github.com/trevormcinroe/pixelbrax.
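The two mechanisms named in the abstract, rendering across many parallel environments and explicit handling of stochasticity, can be illustrated with a minimal JAX sketch. This is not PixelBrax's API: `toy_reset`, `toy_step`, `toy_render`, `rollout_frame`, and the constants `NUM_ENVS` and `IMG_SIZE` are hypothetical names invented here, and the "renderer" lights a single pixel instead of drawing a Brax scene. The sketch assumes only standard JAX calls (`jax.vmap`, `jax.jit`, `jax.random`).

```python
import jax
import jax.numpy as jnp

NUM_ENVS = 1024  # number of parallel environments (illustrative value)
IMG_SIZE = 16    # toy image resolution (illustrative value)


def toy_reset(key):
    # Hypothetical environment: the state is a 2D point in [0, 1)^2.
    return jax.random.uniform(key, (2,))


def toy_step(state, key):
    # All randomness comes from the key passed in; there is no hidden global RNG.
    noise = 0.05 * jax.random.normal(key, (2,))
    return jnp.clip(state + noise, 0.0, 0.999)


def toy_render(state):
    # Stand-in for a pure JAX renderer: light the pixel under the point.
    idx = jnp.floor(state * IMG_SIZE).astype(jnp.int32)
    img = jnp.zeros((IMG_SIZE, IMG_SIZE), dtype=jnp.float32)
    return img.at[idx[1], idx[0]].set(1.0)


@jax.jit
def rollout_frame(seed_key):
    # Derive one independent key per environment for reset and for step.
    reset_key, step_key = jax.random.split(seed_key)
    reset_keys = jax.random.split(reset_key, NUM_ENVS)
    step_keys = jax.random.split(step_key, NUM_ENVS)
    # vmap batches simulation and rendering over all environments at once.
    states = jax.vmap(toy_reset)(reset_keys)
    states = jax.vmap(toy_step)(states, step_keys)
    return jax.vmap(toy_render)(states)


frames_a = rollout_frame(jax.random.PRNGKey(0))
frames_b = rollout_frame(jax.random.PRNGKey(0))
print(frames_a.shape, bool(jnp.array_equal(frames_a, frames_b)))
```

Because every random draw is a function of the seed key, two rollouts from the same seed produce identical frames, and because simulation and rendering are both traced by `jax.jit`, the whole batch runs on whichever accelerator JAX selects without a round trip to the CPU.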