TR LG ML Jul 8, 2018

Financial Trading as a Game: A Deep Reinforcement Learning Approach

arXiv:1807.02787v1 16.479 citations

Originality: Incremental advance

AI Analysis

This work addresses market practitioners' need for profitable automated trading systems, but its contribution is incremental and relies on domain-specific techniques.

The paper tackles the problem of developing an automatic trading agent for financial markets by proposing a Markov Decision Process model and modifications to the deep recurrent Q-network algorithm, achieving strong empirical performance in the spot foreign exchange market.

An automatic program that generates constant profit from the financial market is lucrative for every market practitioner. Recent advances in deep reinforcement learning provide a framework for end-to-end training of such a trading agent. In this paper, we propose a Markov Decision Process (MDP) model suitable for the financial trading task and solve it with the state-of-the-art deep recurrent Q-network (DRQN) algorithm. We propose several modifications to the existing learning algorithm to make it more suitable for the financial trading setting, namely:

1. We employ a substantially smaller replay memory (only a few hundred in size) than those used in modern deep reinforcement learning algorithms (often millions in size).
2. We develop an action augmentation technique that mitigates the need for random exploration by providing the agent with extra feedback signals for all actions. This enables us to use a greedy policy over the course of learning, and it shows strong empirical performance compared to the more commonly used epsilon-greedy exploration. However, this technique is specific to financial trading under a few market assumptions.
3. We sample a longer sequence for recurrent neural network training. A side product of this mechanism is that we can now train the agent once every T steps. This greatly reduces training time, since the overall computation is down by a factor of T.

We combine all of the above into a complete online learning algorithm and validate our approach on the spot foreign exchange market.
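The three modifications listed in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The action encoding (`-1`, `0`, `1`), the reward formula, the transaction cost, the default capacity, and the names `augmented_rewards`, `SmallReplayMemory`, and `run_online` are all assumptions made for illustration. The `policy` and `update` callables are hypothetical stand-ins for a greedy DRQN policy and its gradient step. The one market assumption made explicit here is that the agent's own trades do not move the price, which is what would allow a reward to be computed for actions that were not taken.

```python
import random
from collections import deque

# Hypothetical action encoding: short, neutral, long.
ACTIONS = (-1, 0, 1)


def augmented_rewards(prev_position, price_change, cost_per_unit=0.0001):
    """Return a reward for every action, not only the one actually taken.

    Assumption: the agent's trades do not affect the price, so the outcome of
    each untaken action can be computed from the same observed price change.
    The reward formula is illustrative only.
    """
    rewards = []
    for action in ACTIONS:
        pnl = action * price_change
        cost = cost_per_unit * abs(action - prev_position)
        rewards.append(pnl - cost)
    return rewards


class SmallReplayMemory:
    """Bounded memory; the oldest entries are dropped automatically."""

    def __init__(self, capacity=300):  # "a few hundred"; exact value is arbitrary
        self.buffer = deque(maxlen=capacity)

    def push(self, observation, rewards):
        self.buffer.append((observation, rewards))

    def sample_sequence(self, length):
        """Sample one contiguous sequence for recurrent training, or None."""
        if len(self.buffer) < length:
            return None
        start = random.randint(0, len(self.buffer) - length)
        return list(self.buffer)[start:start + length]


def run_online(price_changes, policy, update,
               train_every=4, seq_len=8, capacity=16):
    """Online loop: act greedily at every step, update once every T steps."""
    memory = SmallReplayMemory(capacity)
    position = 0
    n_updates = 0
    for step, change in enumerate(price_changes, start=1):
        action = policy(position)  # greedy choice, no epsilon-random actions
        memory.push((position, change), augmented_rewards(position, change))
        position = action
        if step % train_every == 0:
            batch = memory.sample_sequence(seq_len)
            if batch is not None:
                update(batch)
                n_updates += 1
    return n_updates
```

In this sketch, calling `run_online` with `train_every=T` triggers an update on at most one step in every `T`, which is the source of the factor-of-T saving the abstract mentions. Because each stored entry carries a reward for every action, an update could in principle adjust all action values at once rather than only the value of the action that was taken.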
