MF | AI | RM | May 7, 2025

Risk-sensitive Reinforcement Learning Based on Convex Scoring Functions

arXiv:2505.04553v21 citations | h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses risk-sensitive decision-making in reinforcement learning, with applications such as finance. It appears incremental because it builds on existing risk measures and methods.

The authors address reinforcement learning under risk objectives by proposing a framework based on convex scoring functions, which covers measures such as variance and Expected Shortfall. They validate it in simulation experiments on financial trading, which they report as demonstrating the algorithm's effectiveness.

We propose a reinforcement learning (RL) framework under a broad class of risk objectives, characterized by convex scoring functions. This class covers many common risk measures, such as variance, Expected Shortfall, entropic Value-at-Risk, and mean-risk utility. To resolve the time-inconsistency issue, we consider an augmented state space and an auxiliary variable and recast the problem as a two-state optimization problem. We propose a customized Actor-Critic algorithm and establish some theoretical approximation guarantees. A key theoretical contribution is that our results do not require the Markov decision process to be continuous. Additionally, we propose an auxiliary variable sampling method inspired by the alternating minimization algorithm, which is convergent under certain conditions. We validate our approach in simulation experiments with a financial application in statistical arbitrage trading, demonstrating the effectiveness of the algorithm.
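To make the role of the auxiliary variable concrete, the sketch below writes two of the listed risk measures as the minimum, over an auxiliary scalar v, of the expected value of a convex score. It relies on two textbook identities rather than on the paper's own definitions, which this summary does not reproduce: variance equals the minimum over v of E[(X - v)^2], and Expected Shortfall at level alpha equals the minimum over v of v + E[(X - v)^+] / (1 - alpha) (the Rockafellar-Uryasev representation, with X read as a loss). All function names are hypothetical, the samples are simulated, and the brute-force search over candidate values stands in for, and is not, the alternating-minimization sampling scheme described in the abstract.

```python
import numpy as np


def variance_score(v, x):
    """Squared-error score; its expectation is minimised at v = E[X]."""
    return (x - v) ** 2


def expected_shortfall_score(v, x, alpha):
    """Rockafellar-Uryasev score for losses x at confidence level alpha."""
    return v + np.maximum(x - v, 0.0) / (1.0 - alpha)


def risk_via_auxiliary(score, samples, candidates):
    """Minimise the sample mean of score(v, samples) over candidate values v.

    Returns (best_v, risk_estimate). Brute force, for illustration only.
    """
    means = np.array([np.mean(score(v, samples)) for v in candidates])
    best = int(np.argmin(means))
    return candidates[best], means[best]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    losses = rng.normal(size=1000)  # simulated losses, not data from the paper

    # Variance: search v on a fine grid around the sample mean.
    grid = np.linspace(-1.0, 1.0, 2001)
    v_var, var_est = risk_via_auxiliary(variance_score, losses, grid)
    print("variance via auxiliary v:", var_est, "direct:", np.var(losses))

    # Expected Shortfall: the empirical minimiser is a sample point,
    # so the sorted samples themselves serve as candidates.
    alpha = 0.9
    v_es, es_est = risk_via_auxiliary(
        lambda v, x: expected_shortfall_score(v, x, alpha),
        losses,
        np.sort(losses),
    )
    print("ES via auxiliary v:", es_est, "minimising v:", v_es)
```

In both cases the minimising v has a familiar meaning (the mean for variance, a quantile for Expected Shortfall), which is one way to see why carrying v alongside the state can turn a time-inconsistent risk objective into a standard expected-score problem for fixed v.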

