Model-based Deep Reinforcement Learning for Dynamic Portfolio Optimization
The paper addresses the problem of automating investment decisions for investors, but its contribution is incremental: it builds on existing RL methods and adds task-specific modules.
The paper tackles dynamic portfolio optimization by designing a deep reinforcement learning architecture with an autonomous trading agent that incorporates an infused prediction module, generative adversarial data augmentation, and behavior cloning. Using real financial market data, it demonstrates robustness, profitability, and risk-sensitivity relative to baselines.
Dynamic portfolio optimization is the process of sequentially allocating wealth across a collection of assets over consecutive trading periods, based on the investor's return-risk profile. Automating this process with machine learning remains a challenging problem. Here, we design a deep reinforcement learning (RL) architecture with an autonomous trading agent such that investment decisions and actions are made periodically and autonomously, based on a global objective. In particular, rather than relying on a purely model-free RL agent, we train our trading agent using a novel RL architecture consisting of an infused prediction module (IPM), a generative adversarial data augmentation module (DAM), and a behavior cloning module (BCM). Our model-based approach works with both on-policy and off-policy RL algorithms. We further design back-testing and execution engines that interact with the RL agent in real time. Using real historical financial market data, we simulate trading with practical constraints and demonstrate that our proposed model is robust, profitable, and risk-sensitive compared with baseline trading strategies and model-free RL agents from prior work.
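To make the sequential allocation setting concrete, the following is a minimal sketch of how portfolio value might evolve when an agent chooses asset weights at the start of each trading period. It is not the paper's back-testing engine. The function name `simulate_portfolio`, the price-relative input format, and the proportional turnover cost `cost_rate` are all assumptions introduced for illustration; the abstract does not specify the cost model or constraints used.

```python
import numpy as np

def simulate_portfolio(price_relatives, weights, cost_rate=0.0):
    """Track portfolio value over T trading periods (illustrative only).

    price_relatives: (T, N) array, entry [t, i] = price_i(t) / price_i(t-1).
    weights:         (T, N) array of target allocations per period; rows sum to 1.
    cost_rate:       hypothetical proportional cost charged on turnover.
    Returns an array of length T + 1 with the initial value normalized to 1.
    """
    y = np.asarray(price_relatives, dtype=float)
    w = np.asarray(weights, dtype=float)
    if y.shape != w.shape:
        raise ValueError("price_relatives and weights must have the same shape")
    if not np.allclose(w.sum(axis=1), 1.0):
        raise ValueError("each row of weights must sum to 1")

    n_periods = y.shape[0]
    values = np.empty(n_periods + 1)
    values[0] = 1.0
    held = w[0].copy()  # assumption: the first allocation is free of cost

    for t in range(n_periods):
        # Cost of rebalancing from the drifted holdings to the new target.
        turnover = np.abs(w[t] - held).sum()
        cost = cost_rate * turnover
        # Growth of the rebalanced portfolio over the period.
        growth = float(w[t] @ y[t])
        values[t + 1] = values[t] * (1.0 - cost) * growth
        # Weights drift with prices before the next rebalance.
        held = w[t] * y[t] / growth
    return values
```

An RL agent in this setting would supply the rows of `weights` one period at a time, and a reward could be derived from the change in `values`; how the paper actually shapes the reward for risk-sensitivity is not stated in the abstract.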