LG, SY | May 19, 2024

On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks

arXiv:2405.11432v3 | 12 citations | h-index: 34
Originality: Incremental advance
AI Analysis

This work addresses robustness issues in reinforcement learning for applications like robotics and gaming, but it is incremental as it builds on existing Lipschitz methods.

The paper studies robust policy networks in deep reinforcement learning by analyzing Lipschitz-bounded parameterizations. It finds that they are more robust to disturbances and adversarial attacks than unconstrained policies, and that more expressive layers such as the Sandwich layer achieve a better trade-off between robustness and clean performance than spectral normalization.

This paper presents a study of robust policy networks in deep reinforcement learning. We investigate the benefits of policy parameterizations that naturally satisfy constraints on their Lipschitz bound, analyzing their empirical performance and robustness on two representative problems: pendulum swing-up and Atari Pong. We illustrate that policy networks with smaller Lipschitz bounds are more robust to disturbances, random noise, and targeted adversarial attacks than unconstrained policies composed of vanilla multi-layer perceptrons or convolutional neural networks. However, the structure of the Lipschitz layer is important. We find that the widely-used method of spectral normalization is too conservative and severely impacts clean performance, whereas more expressive Lipschitz layers such as the recently-proposed Sandwich layer can achieve improved robustness without sacrificing clean performance.
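To make the notion of a Lipschitz-bounded policy concrete, here is a minimal NumPy sketch of the spectral-normalization baseline mentioned in the abstract. It is not the paper's implementation and does not implement the Sandwich layer. The layer sizes, the bound `gamma`, and the helper names (`spectral_normalize`, `mlp`, `lipschitz_upper_bound`, `sensitivity`) are assumptions chosen for illustration. Because ReLU is 1-Lipschitz, the product of the layers' spectral norms is an upper bound on the Lipschitz constant of the whole network.

```python
import numpy as np

def spectral_normalize(W, bound=1.0):
    """Rescale W so its spectral norm (largest singular value) is at most `bound`."""
    sigma = np.linalg.norm(W, ord=2)  # largest singular value of a 2-D array
    return W * min(1.0, bound / sigma)

def mlp(x, weights, biases):
    """Plain ReLU MLP with a linear output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)  # ReLU is 1-Lipschitz
    return weights[-1] @ h + biases[-1]

def lipschitz_upper_bound(weights):
    """Product of per-layer spectral norms: an upper bound, generally not tight."""
    return float(np.prod([np.linalg.norm(W, ord=2) for W in weights]))

# Hypothetical policy network: 4 inputs, two hidden layers of 32 units, 2 outputs.
rng = np.random.default_rng(0)
sizes = [4, 32, 32, 2]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=m) for m in sizes[1:]]

# Split an overall bound gamma evenly across layers (an illustrative choice).
gamma = 2.0
per_layer = gamma ** (1.0 / len(weights))
sn_weights = [spectral_normalize(W, per_layer) for W in weights]

# Compare how strongly a small input perturbation changes the output.
x = rng.normal(size=sizes[0])
delta = 1e-2 * rng.normal(size=sizes[0])

def sensitivity(ws):
    """Ratio of output change to input change for one perturbation."""
    diff = mlp(x + delta, ws, biases) - mlp(x, ws, biases)
    return float(np.linalg.norm(diff) / np.linalg.norm(delta))

print("bound, unconstrained:", lipschitz_upper_bound(weights))
print("bound, normalized:   ", lipschitz_upper_bound(sn_weights))
print("sensitivity, unconstrained:", sensitivity(weights))
print("sensitivity, normalized:   ", sensitivity(sn_weights))
```

For the normalized network, the measured sensitivity can never exceed `gamma`, whatever the perturbation. The product bound is usually loose, which is one way to see why constraining every layer separately can cost expressiveness.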

Code Implementations: 2 repos