LG · AI · Apr 30, 2020

DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning

arXiv:2004.14547v3 · 24 citations
AI Analysis

This work addresses risk-sensitive control in reinforcement learning and offers a unified framework for both risk-neutral and risk-sensitive tasks. It appears incremental, since it builds on existing methods such as Soft Actor-Critic.

The paper tackles risk-sensitive reinforcement learning by proposing DSAC, a distributional algorithm that models randomness in both actions and rewards. DSAC outperformed baselines on various continuous control tasks.

We present Distributional Soft Actor-Critic (DSAC), a distributional reinforcement learning (RL) algorithm that combines the distributional information of accumulated rewards with the entropy-driven exploration of the Soft Actor-Critic (SAC) algorithm. DSAC models the randomness in both actions and rewards, and surpasses baseline performance on various continuous control tasks. Unlike standard approaches that solely maximize expected rewards, we propose a unified framework for risk-sensitive learning, one that optimizes a risk-related objective while balancing entropy to encourage exploration. Extensive experiments demonstrate DSAC's effectiveness in enhancing agent performance on both risk-neutral and risk-sensitive control tasks.
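
To make the contrast between maximizing expected reward and optimizing a risk-related objective concrete, here is a minimal sketch. It is not the DSAC algorithm: it assumes the return distribution of each action is summarized by a few equally weighted quantile values, and it uses conditional value-at-risk (CVaR) as one illustrative risk measure. The abstract names neither choice. The function names and all numbers are hypothetical.

```python
import math


def mean_return(quantiles):
    """Risk-neutral value: average of equally weighted return quantiles."""
    return sum(quantiles) / len(quantiles)


def cvar_return(quantiles, alpha):
    """Risk-averse value: average of the worst ceil(alpha * N) quantiles.

    CVaR is an assumed example of a risk measure, not one taken from the text.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    k = max(1, math.ceil(alpha * len(quantiles)))
    worst = sorted(quantiles)[:k]
    return sum(worst) / k


# Hypothetical return quantiles for two candidate actions (made-up numbers).
safe = [4.0, 5.0, 5.0, 6.0]
risky = [-10.0, 6.0, 12.0, 16.0]

# A risk-neutral agent compares means; a risk-averse agent compares CVaR.
risk_neutral_choice = "risky" if mean_return(risky) > mean_return(safe) else "safe"
risk_averse_choice = "risky" if cvar_return(risky, 0.25) > cvar_return(safe, 0.25) else "safe"
```

With these made-up numbers the risk-neutral comparison prefers the risky action (mean 6.0 against 5.0), while the CVaR comparison at alpha = 0.25 prefers the safe one (4.0 against -10.0). Setting alpha = 1.0 recovers the mean, which is one way a single objective can cover both the risk-neutral and the risk-sensitive case.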

