AI LG | Dec 23, 2020

Deep Stock Trading: A Hierarchical Reinforcement Learning Framework for Portfolio Optimization and Order Execution

Rundong Wang, Hongxin Wei, Bo An, Zhouyan Feng, Jun Yao

arXiv:2012.12620v217.550 citations

Originality: Highly original

AI Analysis

This work addresses the practical challenge of price slippage in algorithmic stock trading, which is crucial for financial institutions and individual investors seeking to maximize long-term profits and minimize trading costs.

This paper tackles the problem of portfolio optimization and order execution in stock trading, which existing methods often oversimplify by ignoring price slippage. The authors propose a hierarchical reinforcement learning framework that decomposes the trading process into high-level portfolio management and low-level trade execution, demonstrating significant improvements over state-of-the-art approaches in both U.S. and China markets.

Portfolio management via reinforcement learning is at the forefront of fintech research, which explores how to optimally reallocate a fund into different financial assets over the long term by trial and error. Existing methods are impractical because they usually assume each reallocation can be finished immediately and thus ignore price slippage as part of the trading cost. To address these issues, we propose a hierarchical reinforced stock trading system for portfolio management (HRPM). Concretely, we decompose the trading process into a hierarchy of portfolio management over trade execution and train the corresponding policies. The high-level policy gives portfolio weights at a lower frequency to maximize the long-term profit, and invokes the low-level policy to sell or buy the corresponding shares within a short time window at a higher frequency to minimize the trading cost. We train the two levels of policies via a pre-training scheme and an iterative training scheme for data efficiency. Extensive experimental results in the U.S. market and the China market demonstrate that HRPM achieves significant improvement over many state-of-the-art approaches.
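The two-level decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: HRPM learns both policies with reinforcement learning, whereas here the high-level weights and the low-level execution schedule are fixed, hand-picked inputs. The function names (`target_shares`, `execute_order`, `slippage_cost`) and all numbers are hypothetical, chosen only to show how slippage arises when an order is filled over a window rather than instantly.

```python
from typing import List


def target_shares(weights: List[float], prices: List[float],
                  portfolio_value: float) -> List[float]:
    """High-level step: turn target portfolio weights into share counts.

    In HRPM the weights would come from a learned high-level policy;
    here they are simply passed in (an assumption of this sketch).
    """
    return [w * portfolio_value / p for w, p in zip(weights, prices)]


def execute_order(total_shares: float, window_prices: List[float],
                  schedule: List[float]) -> float:
    """Low-level step: split one parent order across a short window.

    `schedule` holds the fraction of the order filled at each step and
    should sum to 1. Returns the volume-weighted average fill price.
    A learned low-level policy would choose the schedule; here it is fixed.
    """
    if total_shares == 0:
        return window_prices[0]  # nothing to trade, so no fill price to average
    filled_value = sum(total_shares * f * p
                       for f, p in zip(schedule, window_prices))
    return filled_value / total_shares


def slippage_cost(total_shares: float, decision_price: float,
                  avg_fill_price: float) -> float:
    """Cost of filling away from the decision price (positive shares = buy)."""
    return total_shares * (avg_fill_price - decision_price)


# Hypothetical example: two assets, equal weights, 1000 units of capital.
decision_prices = [10.0, 20.0]
targets = target_shares([0.5, 0.5], decision_prices, 1000.0)

# Buy the first asset over a three-step window while its price drifts upward.
avg_fill = execute_order(targets[0], [10.0, 10.4, 10.8], [0.5, 0.25, 0.25])
cost = slippage_cost(targets[0], decision_prices[0], avg_fill)
print(f"target shares: {targets}, average fill: {avg_fill:.2f}, slippage: {cost:.2f}")
```

Under the instant-execution assumption the abstract criticizes, the fill price would equal the decision price and the slippage term would be zero. Filling over a window makes that cost explicit, and it is the quantity the low-level policy is meant to minimize.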
