Risk Perspective Exploration in Distributional Reinforcement Learning
This work addresses a specific bottleneck in distributional RL for multi-agent systems, but appears incremental, as it builds on existing methods such as DMIX.
The paper addresses the lack of exploration methods that use risk properties in distributional reinforcement learning by proposing risk scheduling approaches, and reports improved performance in a multi-agent setting through comprehensive experiments.
Distributional reinforcement learning demonstrates state-of-the-art performance in continuous and discrete control settings, and the return distributions it learns expose both variance and risk, which can be used for exploration. However, although numerous exploration methods in distributional RL use the variance of the return distribution per action, exploration methods that exploit the risk property are hard to find. In this paper, we present risk scheduling approaches that explore risk levels and optimistic behaviors from a risk perspective. We demonstrate the performance enhancement of the DMIX algorithm using risk scheduling in a multi-agent setting with comprehensive experiments.
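To make the idea of risk scheduling concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the schedule or the risk measure, so everything here is an assumption for illustration: a linear schedule, an upper-tail mean as the optimistic (risk-seeking) measure, and the hypothetical names `linear_risk_schedule`, `upper_tail_value`, and `select_action`. The per-action quantile estimates stand in for what a distributional agent would output.

```python
import numpy as np


def linear_risk_schedule(step, total_steps, start=0.25, end=1.0):
    """Hypothetical schedule: move the risk level linearly from start to end.

    Assumption: a small level means optimistic (risk-seeking) behavior and
    1.0 means risk-neutral behavior.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + frac * (end - start)


def upper_tail_value(quantiles, alpha):
    """Mean of the top alpha fraction of quantile estimates for each action.

    quantiles has shape (num_actions, num_quantiles). With alpha = 1.0 this
    is the plain mean of the quantiles; smaller alpha looks only at the
    best outcomes, which favors actions with high upside.
    """
    n = quantiles.shape[-1]
    k = max(1, int(np.ceil(alpha * n)))
    sorted_q = np.sort(quantiles, axis=-1)
    return sorted_q[..., n - k:].mean(axis=-1)


def select_action(quantiles, step, total_steps):
    """Greedy action under the risk level scheduled for the current step."""
    alpha = linear_risk_schedule(step, total_steps)
    return int(np.argmax(upper_tail_value(quantiles, alpha)))


# Toy example: action 0 has rare high returns, action 1 is steady.
toy_quantiles = np.array([
    [0.0, 0.0, 0.0, 10.0],
    [3.0, 3.0, 3.0, 3.0],
])
early_action = select_action(toy_quantiles, step=0, total_steps=100)
late_action = select_action(toy_quantiles, step=100, total_steps=100)
```

In this toy setup the agent prefers the high-variance action early in training, when the scheduled risk level is optimistic, and switches to the action with the higher mean once the schedule reaches the risk-neutral level. In a multi-agent method such as DMIX, a selection rule of this kind would presumably be applied per agent, but that wiring is not described in the abstract and is left out here.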