SimSR: Simple Distance-based State Representation for Deep Reinforcement Learning
This work addresses representation learning challenges in reinforcement learning for visual tasks, but it appears incremental as it builds on existing bisimulation metric methods.
The authors tackle the problem of learning robust, generalizable state representations from images in deep reinforcement learning. They introduce the SimSR operator, which addresses the computational complexity and representation collapse of existing bisimulation-metric methods, and report better performance, robustness, and generalization than state-of-the-art solutions on visual MuJoCo tasks.
This work explores how to learn robust and generalizable state representations from image-based observations with deep reinforcement learning methods. To address the computational complexity, stringent assumptions, and representation collapse challenges in existing work on bisimulation metrics, we devise the Simple State Representation (SimSR) operator. SimSR enables us to design a stochastic approximation method that can practically learn the mapping functions (encoders) from observations to the latent representation space. In addition to theoretical analysis and comparison with existing work, we experiment with and compare our method against recent state-of-the-art solutions on visual MuJoCo tasks. The results show that our model generally achieves better performance, with better robustness and good generalization.
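The abstract does not spell out the operator, so the following is only an illustrative sketch of what a distance-based, bisimulation-style objective can look like. It assumes a cosine distance between latent vectors and a target made of the reward difference plus the discounted distance between next-state latents. The function names, the discount value, and the squared-error form are all hypothetical choices for illustration, not the authors' implementation.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)


def bisim_style_target(r_i, r_j, z_next_i, z_next_j, gamma=0.99):
    """Assumed target: reward difference plus discounted next-latent distance."""
    return abs(r_i - r_j) + gamma * cosine_distance(z_next_i, z_next_j)


def distance_matching_loss(z_i, z_j, r_i, r_j, z_next_i, z_next_j, gamma=0.99):
    """Squared error between the current latent distance and the assumed target.

    In a trainable version the target would typically be treated as a constant
    (no gradient flowing through it); this sketch only computes the value.
    """
    target = bisim_style_target(r_i, r_j, z_next_i, z_next_j, gamma)
    return (cosine_distance(z_i, z_j) - target) ** 2
```

Under these assumptions, two states with equal rewards and identical next-state latents have a target distance of zero, so the loss pushes their current latents together. In a stochastic approximation setting, such a loss would be averaged over sampled pairs of transitions rather than computed over all state pairs.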