The Central Role of the Loss Function in Reinforcement Learning
This paper provides a theoretical guide for improving decision-making algorithms in reinforcement learning by selecting better loss functions, though its contribution is incremental because it builds on existing methods.
This paper demonstrates that the choice of regression loss function significantly affects the sample efficiency and adaptivity of value-based decision-making algorithms in reinforcement learning. It proves that binary cross-entropy loss achieves first-order bounds scaling with the optimal policy's cost and is more efficient than squared loss, while maximum likelihood loss in distributional algorithms yields even sharper second-order bounds scaling with policy variance.
This paper illustrates the central role of loss functions in data-driven decision making, providing a comprehensive survey on their influence in cost-sensitive classification (CSC) and reinforcement learning (RL). We demonstrate how different regression loss functions affect the sample efficiency and adaptivity of value-based decision-making algorithms. Across multiple settings, we prove that algorithms using the binary cross-entropy loss achieve first-order bounds scaling with the optimal policy's cost and are much more efficient than the commonly used squared loss. Moreover, we prove that distributional algorithms using the maximum likelihood loss achieve second-order bounds scaling with the policy variance and are even sharper than first-order bounds. This in particular proves the benefits of distributional RL. We hope that this paper serves as a guide for analyzing decision-making algorithms with varying loss functions, and can inspire the reader to seek out better loss functions to improve any decision-making algorithm.
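To make the three losses named in the abstract concrete, the following is a minimal NumPy sketch of their per-sample formulas. It does not reproduce the paper's algorithms or bounds. It assumes costs are normalized to [0, 1], and, for the distributional case, that the cost distribution is represented as a categorical distribution over discrete bins; that discretization and the function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def squared_loss(pred, target):
    """Squared regression loss between a predicted mean cost and an observed cost."""
    return (pred - target) ** 2


def bce_loss(pred, target, eps=1e-12):
    """Binary cross-entropy between a predicted mean cost and an observed cost.

    Both values are assumed to lie in [0, 1] (an assumption of this sketch);
    the target may be a soft value, not only 0 or 1.
    """
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))


def mle_loss(probs, target_bin, eps=1e-12):
    """Negative log-likelihood of the observed cost bin.

    `probs` is a predicted categorical distribution over discretized costs
    (a hypothetical representation chosen for illustration).
    """
    return -np.log(np.clip(probs[target_bin], eps, 1.0))


# Compare how the two mean-regression losses penalize the same prediction
# error when the observed cost is small.
target = 0.0
for pred in (0.05, 0.5, 0.95):
    print(pred, squared_loss(pred, target), bce_loss(pred, target))

# A distributional prediction over three cost bins, scored on the observed bin.
probs = np.array([0.7, 0.2, 0.1])
print(mle_loss(probs, target_bin=0))
```

Both squared loss and binary cross-entropy are minimized when the prediction equals the target, so they estimate the same mean cost; they differ in how strongly they penalize errors. The maximum likelihood loss instead scores a full predicted distribution, which is the object that distributional algorithms fit.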