PM | AI | LG | Jun 6, 2022

Balancing Profit, Risk, and Sustainability for Portfolio Management

arXiv:2207.02134v1 | 13 citations | h-index: 26
Originality: Incremental advance
AI Analysis

This work addresses the problem of balancing profit, risk, and sustainability in portfolio management for financial traders, representing an incremental improvement over existing reinforcement learning methods.

The paper tackles stock portfolio optimization with a novel utility function that incorporates risk (measured by the Sharpe ratio) and sustainability (measured by the ESG score). Because flat policy gradients prevented MADDPG from finding the optimal policy, the authors replaced gradient descent with a genetic algorithm for parameter optimization. The resulting system outperforms MADDPG and improves on deep Q-learning approaches by allowing continuous action spaces.
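To make the idea of a combined objective concrete, here is a minimal sketch of a utility that mixes profit, the Sharpe ratio, and a portfolio-level ESG score. The summary does not give the paper's actual formula, so the weighted-sum form, the function names (`sharpe_ratio`, `portfolio_utility`), the trade-off coefficients, and the assumption that ESG scores are normalized to [0, 1] are all illustrative choices, not the authors' definition.

```python
import numpy as np


def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of daily portfolio returns."""
    excess = np.asarray(daily_returns, dtype=float) - risk_free_rate / periods_per_year
    std = excess.std(ddof=1)
    if std < 1e-12:
        # No variation in returns: the ratio is undefined, so return 0 by convention.
        return 0.0
    return float(np.sqrt(periods_per_year) * excess.mean() / std)


def portfolio_utility(weights, asset_returns, esg_scores,
                      w_profit=1.0, w_risk=1.0, w_esg=1.0):
    """Hypothetical weighted-sum utility (an assumption, not the paper's formula).

    weights:       portfolio weights, shape (N,), summing to 1
    asset_returns: daily returns per asset, shape (T, N)
    esg_scores:    per-asset ESG scores, shape (N,), assumed to lie in [0, 1]
    """
    weights = np.asarray(weights, dtype=float)
    asset_returns = np.asarray(asset_returns, dtype=float)
    esg_scores = np.asarray(esg_scores, dtype=float)

    daily = asset_returns @ weights               # portfolio return for each day
    profit = float(np.prod(1.0 + daily) - 1.0)    # compounded return over the window
    risk_adjusted = sharpe_ratio(daily)           # risk term
    esg = float(weights @ esg_scores)             # weight-averaged ESG score
    return w_profit * profit + w_risk * risk_adjusted + w_esg * esg


# Usage with made-up numbers: two assets observed over three days.
example_returns = np.array([[0.01, 0.03], [-0.01, 0.01], [0.02, 0.00]])
example_utility = portfolio_utility([0.5, 0.5], example_returns, [0.2, 0.8])
```

The three coefficients make the trade-off explicit: setting `w_esg` to zero recovers a purely financial objective, while raising it shifts the preferred allocation toward higher-scoring assets.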

Stock portfolio optimization is the process of continuously reallocating funds across a selection of stocks. This problem is particularly well suited to reinforcement learning, as daily rewards compound and objective functions may include more than just profit, e.g., risk and sustainability. We developed a novel utility function with the Sharpe ratio representing risk and the environmental, social, and governance (ESG) score representing sustainability. We show that a state-of-the-art policy gradient method, multi-agent deep deterministic policy gradients (MADDPG), fails to find the optimal policy due to flat policy gradients, and we therefore replaced gradient descent with a genetic algorithm for parameter optimization. We show that our system outperforms MADDPG while improving on deep Q-learning approaches by allowing for continuous action spaces. Crucially, by incorporating risk and sustainability criteria in the utility function, we improve on the state of the art in reinforcement learning for portfolio optimization; risk and sustainability are essential in any modern trading strategy, and we propose a system that does not merely report these metrics but actively optimizes the portfolio to improve on them.
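The substitution of a genetic algorithm for gradient descent can be illustrated with a small, gradient-free search over policy parameters. This is a sketch under stated assumptions rather than the authors' implementation: the elitist selection scheme, uniform crossover, Gaussian mutation, the softmax mapping from parameters to continuous portfolio weights, and the toy fitness function are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(z):
    """Map unconstrained parameters to non-negative weights that sum to 1."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()


def evolve(fitness, n_params, pop_size=40, n_generations=60,
           elite_frac=0.25, mutation_std=0.1, seed=0):
    """Minimal elitist genetic algorithm; needs only fitness values, no gradients."""
    rng = np.random.default_rng(seed)
    population = rng.normal(0.0, 1.0, size=(pop_size, n_params))
    n_elite = max(2, int(elite_frac * pop_size))

    for _ in range(n_generations):
        scores = np.array([fitness(ind) for ind in population])
        # Keep the highest-scoring individuals unchanged (elitism).
        elite = population[np.argsort(scores)[::-1][:n_elite]]

        children = []
        while len(children) < pop_size - n_elite:
            i, j = rng.choice(n_elite, size=2, replace=False)
            mask = rng.random(n_params) < 0.5             # uniform crossover
            child = np.where(mask, elite[i], elite[j])
            child = child + rng.normal(0.0, mutation_std, size=n_params)  # mutation
            children.append(child)

        population = np.vstack([elite, np.array(children)])

    scores = np.array([fitness(ind) for ind in population])
    return population[int(np.argmax(scores))]


# Toy fitness: a made-up per-asset utility, so the best allocation favors asset 1.
score_per_asset = np.array([0.2, 0.65, 0.3])


def toy_fitness(params):
    return float(softmax(params) @ score_per_asset)


best_params = evolve(toy_fitness, n_params=3)
best_weights = softmax(best_params)   # continuous allocation across three assets
```

Because selection depends only on the ranking of fitness values, the search still makes progress where the gradient of the objective is nearly flat, and the softmax output gives a continuous action space rather than a discrete set of trades.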

