AI

Apr 16, 2024

Simplex Decomposition for Portfolio Allocation Constraints in Reinforcement Learning

David Winkel, Niklas Strauß, Matthias Schubert, Thomas Seidl

arXiv:2404.10683v14.22 citations

h-index: 3

ECAI

Originality: Incremental advance

AI Analysis

This work addresses portfolio optimization for investors who need to enforce allocation constraints, such as limiting exposure to specific sectors. The advance is incremental: it builds on existing constrained RL methods and focuses on the specific case of two constraints.

The paper tackles the suboptimal performance of general constrained reinforcement learning (CRL) methods in portfolio optimization. It proposes decomposing the constrained action space into a set of unconstrained allocation problems, examined for the case of two constraints. The resulting approach, CAOSD, is reported to consistently outperform state-of-the-art CRL benchmarks on real-world Nasdaq-100 data.

Portfolio optimization tasks describe sequential decision problems in which the investor's wealth is distributed across a set of assets. Allocation constraints are used to enforce minimal or maximal investments into particular subsets of assets to control for objectives such as limiting the portfolio's exposure to a certain sector due to environmental concerns. Although methods for constrained Reinforcement Learning (CRL) can optimize policies while considering allocation constraints, it can be observed that these general methods yield suboptimal results. In this paper, we propose a novel approach to handle allocation constraints based on a decomposition of the constraint action space into a set of unconstrained allocation problems. In particular, we examine this approach for the case of two constraints. For example, an investor may wish to invest at least a certain percentage of the portfolio into green technologies while limiting the investment in the fossil energy sector. We show that the action space of the task is equivalent to the decomposed action space, and introduce a new reinforcement learning (RL) approach CAOSD, which is built on top of the decomposition. The experimental evaluation on real-world Nasdaq-100 data demonstrates that our approach consistently outperforms state-of-the-art CRL benchmarks for portfolio optimization.
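
To make the idea of a decomposed action space concrete, here is a minimal Python sketch of one way to build an allocation that satisfies a minimum share for one asset group and a maximum share for another by mixing three unconstrained simplex points. It is an illustration only and is not taken from the paper: it is not CAOSD and may differ from the authors' decomposition. The function names (`softmax`, `embed`, `compose_allocation`), the parameters `c_min` and `c_max`, and the use of softmax to produce simplex points are all assumptions. The sketch further assumes that the green and fossil groups are disjoint and that `c_min + c_max <= 1`; under those assumptions, every allocation meeting both constraints should be reachable by some choice of the three simplex points.

```python
import numpy as np


def softmax(scores):
    """Map unconstrained scores to a point on the probability simplex."""
    scores = np.asarray(scores, dtype=float)
    exp = np.exp(scores - scores.max())  # shift for numerical stability
    return exp / exp.sum()


def embed(weights, idx, n_assets):
    """Place sub-simplex weights at positions idx of a length-n_assets vector."""
    full = np.zeros(n_assets)
    full[idx] = weights
    return full


def compose_allocation(scores_green, scores_any, scores_rest,
                       green_idx, fossil_idx, n_assets, c_min, c_max):
    """Hypothetical helper: combine three unconstrained sub-allocations.

    scores_green: one score per green asset
    scores_any:   one score per asset (all n_assets)
    scores_rest:  one score per non-fossil asset
    Guarantees: green share >= c_min, fossil share <= c_max, weights sum to 1.
    """
    green_idx = np.asarray(green_idx, dtype=int)
    fossil_idx = np.asarray(fossil_idx, dtype=int)
    if np.intersect1d(green_idx, fossil_idx).size > 0:
        raise ValueError("this sketch assumes disjoint green and fossil groups")
    if c_min < 0 or c_max < 0 or c_min + c_max > 1:
        raise ValueError("this sketch assumes c_min, c_max >= 0 and c_min + c_max <= 1")

    non_fossil_idx = np.setdiff1d(np.arange(n_assets), fossil_idx)

    x = embed(softmax(scores_green), green_idx, n_assets)     # green assets only
    y = softmax(scores_any)                                   # any asset
    z = embed(softmax(scores_rest), non_fossil_idx, n_assets)  # no fossil assets

    # c_min of the wealth is forced into green assets; fossil assets can only
    # receive wealth through y, which carries at most c_max of the total.
    return c_min * x + c_max * y + (1.0 - c_min - c_max) * z
```

In this sketch, a policy would output the three score vectors freely, and the fixed mixing weights enforce the constraints by construction, so no penalty term or projection step is needed. For example, with six assets, green assets at indices 0 and 1, fossil assets at indices 4 and 5, `c_min=0.3` and `c_max=0.2`, any score vectors of lengths 2, 6 and 4 yield an allocation with at least 30% in green assets and at most 20% in fossil assets.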
