Coarse-Grained Smoothness for RL in Metric Spaces
This paper addresses a foundational issue in reinforcement learning for continuous domains by offering a smoothness assumption that applies more widely than Lipschitz continuity.
The authors address principled decision-making in continuous state-action spaces. They show that Lipschitz continuity of the Q-function fails to hold in many typical domains, and they propose a coarse-grained smoothness definition that generalizes it, yielding significantly tighter bounds on Q-functions and improved learning.
Principled decision-making in continuous state-action spaces is impossible without some assumptions. A common approach is to assume Lipschitz continuity of the Q-function. We show that, unfortunately, this property fails to hold in many typical domains. We propose a new coarse-grained smoothness definition that generalizes the notion of Lipschitz continuity, is more widely applicable, and allows us to compute significantly tighter bounds on Q-functions, leading to improved learning. We provide a theoretical analysis of our new smoothness definition, and discuss its implications and impact on control and exploration in continuous domains.
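To make the claim about Lipschitz continuity concrete, the following is a minimal sketch and is not taken from the paper. It assumes a hypothetical one-dimensional Q-function, `toy_q`, with a single jump (as might arise near the boundary of a reward region), and measures the ratio |Q(x) - Q(y)| / |x - y| for pairs of points straddling the jump. A Lipschitz constant L would have to upper-bound this ratio for every pair. The names `toy_q`, `lipschitz_ratio`, and `ratios_across_jump`, the threshold, and the gap sizes are all illustrative assumptions.

```python
def toy_q(x, threshold=0.5):
    """Hypothetical 1-D Q-function with a jump at `threshold` (illustrative only)."""
    return 1.0 if x >= threshold else 0.0


def lipschitz_ratio(q, x, y):
    """Return |q(x) - q(y)| / |x - y|, the smallest L consistent with this pair."""
    return abs(q(x) - q(y)) / abs(x - y)


def ratios_across_jump(q, center=0.5, gaps=(0.1, 0.01, 0.001)):
    """Evaluate the ratio for pairs straddling `center` at shrinking gaps."""
    return [lipschitz_ratio(q, center - g / 2, center + g / 2) for g in gaps]


if __name__ == "__main__":
    gaps = (0.1, 0.01, 0.001)
    for gap, ratio in zip(gaps, ratios_across_jump(toy_q, gaps=gaps)):
        # The ratio grows roughly like 1 / gap, so no finite L bounds it.
        print(f"gap={gap:g}  ratio={ratio:.1f}")
```

In this toy setting the ratio grows without bound as the gap shrinks, so no finite Lipschitz constant exists, while pairs separated by a larger distance still give a modest ratio. That is only loosely suggestive of why a smoothness notion stated at a coarser scale could remain applicable; the paper's actual coarse-grained definition is not reproduced here.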