Portfolio Optimization with 2D Relative-Attentional Gated Transformer
This research is significant for financial practitioners and investors seeking more realistic and profitable portfolio optimization strategies by incorporating crucial transaction cost constraints.
This paper addresses portfolio optimization under realistic transaction costs and slippage, which are often ignored in existing deep reinforcement learning approaches. The authors propose a novel Deterministic Policy Gradient with 2D Relative-attentional Gated Transformer (DPGRGT) model that, when tested on 20 years of U.S. stock market data, outperformed baseline models.
Portfolio optimization is one of the most actively studied fields in which machine learning approaches have been applied. Many researchers have attempted to solve this problem using deep reinforcement learning because it is inherently well suited to handling the properties of financial markets. However, most of these approaches are hard to apply to real-world trading because they ignore or oversimplify the realistic constraints imposed by transaction costs. These constraints have a significantly negative impact on portfolio profitability. In our research, we assume a conservative level of transaction fees and slippage to make the experiments realistic. To improve performance under these constraints, we propose a novel Deterministic Policy Gradient with 2D Relative-attentional Gated Transformer (DPGRGT) model. By applying learnable relative positional embeddings along the time and asset axes, the model better captures the particular structure of financial data in the portfolio optimization domain. In addition, gating layers and layer reordering are employed to stabilize the convergence of Transformers in reinforcement learning. In our experiments on 20 years of U.S. stock market data, our model outperformed the baseline models, demonstrating its effectiveness.
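To illustrate why transaction fees and slippage matter for profitability, the following is a minimal sketch of a one-period portfolio return net of proportional trading costs. It is not the paper's cost model: the function name `net_portfolio_return`, the parameters `fee_rate` and `slippage_rate`, their default values, and the simple turnover-based cost are all assumptions made for illustration.

```python
def net_portfolio_return(prev_weights, new_weights, asset_returns,
                         fee_rate=0.0025, slippage_rate=0.0005):
    """One-period portfolio return after proportional trading costs.

    Hypothetical helper: the names and default rates are illustrative
    assumptions, not values taken from the paper.
    """
    # Turnover: total fraction of portfolio value traded when rebalancing.
    turnover = sum(abs(n - p) for n, p in zip(new_weights, prev_weights))
    # Assumed cost model: fee plus slippage charged on every unit traded.
    cost = (fee_rate + slippage_rate) * turnover
    # Gross return of the rebalanced portfolio over the period.
    gross = sum(w * r for w, r in zip(new_weights, asset_returns))
    # Costs reduce the capital that goes on to earn the gross return.
    return (1.0 - cost) * (1.0 + gross) - 1.0


if __name__ == "__main__":
    prev = [0.5, 0.5]          # weights held before rebalancing
    new = [0.2, 0.8]           # weights chosen by the policy
    returns = [0.01, 0.02]     # per-asset returns over the period
    print("net with rebalancing:", net_portfolio_return(prev, new, returns))
    print("net without costs:   ", net_portfolio_return(prev, new, returns, 0.0, 0.0))
```

Under this assumed model, a policy that rebalances aggressively pays a cost proportional to its turnover in every period, so a strategy that looks profitable when costs are ignored can lose its edge once they are included.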