State Entropy Regularization for Robust Reinforcement Learning
This work addresses robustness in RL for transfer learning scenarios. Its contribution is incremental, since it builds on existing entropy regularization methods.
The paper tackles the problem of making reinforcement learning robust to structured and spatially correlated perturbations, which are common in transfer learning but often overlooked. It shows that state entropy regularization provides formal guarantees and better robustness than policy entropy regularization, although in practice these advantages are more sensitive to the number of rollouts used for policy evaluation.
State entropy regularization has empirically shown better exploration and sample complexity in reinforcement learning (RL). However, its theoretical guarantees have not been studied. In this paper, we show that state entropy regularization improves robustness to structured and spatially correlated perturbations. These types of variation are common in transfer learning but often overlooked by standard robust RL methods, which typically focus on small, uncorrelated changes. We provide a comprehensive characterization of these robustness properties, including formal guarantees under reward and transition uncertainty, as well as settings where the method performs poorly. Much of our analysis contrasts state entropy with the widely used policy entropy regularization, highlighting their different benefits. Finally, from a practical standpoint, we illustrate that compared with policy entropy, the robustness advantages of state entropy are more sensitive to the number of rollouts used for policy evaluation.
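To make the contrast between the two regularizers concrete, the sketch below computes both quantities in a tabular setting. It is a minimal illustration under stated assumptions, not the paper's implementation: the helper names (`entropy`, `policy_entropy`, `empirical_state_dist`, `state_entropy`), the toy policy, and the toy rollouts are all hypothetical, and state entropy is estimated with a simple plug-in estimator over visited states. In a regularized objective, one of these terms would typically be added to the expected return with a coefficient; the exact form used in the paper is not given in the abstract.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in nats) of a probability vector; zero entries are skipped."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-np.sum(nz * np.log(nz)))


def policy_entropy(policy, state_dist):
    """Expected entropy of pi(.|s), averaged over a state distribution.

    policy: array of shape (n_states, n_actions), each row sums to 1.
    state_dist: array of shape (n_states,), sums to 1.
    """
    per_state = np.array([entropy(row) for row in policy])
    return float(np.dot(state_dist, per_state))


def empirical_state_dist(rollouts, n_states):
    """Plug-in estimate of state visitation frequencies.

    rollouts: list of trajectories, each a list of integer state indices
    in the range [0, n_states).
    """
    counts = np.zeros(n_states)
    for traj in rollouts:
        counts += np.bincount(np.asarray(traj), minlength=n_states)
    return counts / counts.sum()


def state_entropy(rollouts, n_states):
    """Entropy of the empirical state visitation distribution."""
    return entropy(empirical_state_dist(rollouts, n_states))


# Toy example (hypothetical): 2 states, 2 actions.
policy = np.array([[0.5, 0.5],   # uniform over actions in state 0
                   [1.0, 0.0]])  # deterministic in state 1
rollouts = [[0, 1, 0, 1], [0, 0, 1, 1]]

d_hat = empirical_state_dist(rollouts, n_states=2)
h_state = state_entropy(rollouts, n_states=2)
h_policy = policy_entropy(policy, d_hat)
```

The state entropy estimate depends on the rollouts only through the empirical visitation counts, so with few rollouts the estimate can be far from the true value: a single rollout that stays in one state yields an estimate of zero regardless of the policy. Policy entropy, by contrast, is computed directly from the action probabilities. This is one intuition, offered here as an assumption rather than the paper's argument, for why the benefits of state entropy may be more sensitive to the number of rollouts.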