Deep reinforcement learning for time series: playing idealized trading games
This work applies deep reinforcement learning to time series analysis for trading. Its contribution is incremental: it tests existing methods on new idealized scenarios.
The paper uses deep Q-learning to estimate optimal strategies for acting on time series input in idealized trading games. All agents found profitable strategies; GRU-based agents performed best in the univariate game, and MLP-based agents outperformed the others in the bivariate game.
Deep Q-learning is investigated as an end-to-end solution to estimate the optimal strategies for acting on time series input. Experiments are conducted on two idealized trading games: 1) Univariate, where the only input is a wave-like price time series, and 2) Bivariate, where the input includes a random stepwise price time series and a noisy signal time series that is positively correlated with future price changes. The Univariate game tests whether the agent can capture the underlying dynamics, and the Bivariate game tests whether the agent can utilize the hidden relation among the inputs. Stacked Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and multi-layer perceptron (MLP) models are used to estimate Q values. For both games, all agents successfully find a profitable strategy. The GRU-based agents show the best overall performance in the Univariate game, while the MLP-based agents outperform the others in the Bivariate game.
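To make the two game setups concrete, the sketch below generates toy versions of both inputs and computes the standard one-step Q-learning target. It is an illustration under stated assumptions, not the paper's implementation: the sine-plus-noise form of the wave, the step sizes, the noise levels, the reward definition (position times next price change), and all function names are hypothetical choices made here.

```python
import math
import random


def univariate_prices(n, period=50, noise=0.05, seed=0):
    """Wave-like price series. ASSUMPTION: a sine wave plus Gaussian noise."""
    rng = random.Random(seed)
    return [2.0 + math.sin(2.0 * math.pi * t / period) + rng.gauss(0.0, noise)
            for t in range(n)]


def bivariate_series(n, signal_noise=0.5, seed=0):
    """Random stepwise price plus a noisy signal.

    ASSUMPTION: each price step is drawn from {-1, 0, +1}, and signal[t] is
    the upcoming step (prices[t+1] - prices[t]) plus Gaussian noise, so the
    signal is positively correlated with the future price change.
    """
    rng = random.Random(seed)
    steps = [rng.choice([-1.0, 0.0, 1.0]) for _ in range(n)]
    prices = [100.0]
    for step in steps[:-1]:
        prices.append(prices[-1] + step)
    signal = [step + rng.gauss(0.0, signal_noise) for step in steps]
    return prices, signal


def episode_profit(prices, positions):
    """Total profit when positions[t] (-1, 0, or +1) is held from t to t+1.

    ASSUMPTION: reward is position times price change, with no trading costs.
    """
    return sum(pos * (prices[t + 1] - prices[t])
               for t, pos in enumerate(positions[:len(prices) - 1]))


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    return reward if done else reward + gamma * max(next_q_values)
```

As a usage example, a hand-written rule that goes long when the signal is positive and short when it is negative gives a simple baseline for the Bivariate game; a learned agent would have to discover this relation from the raw inputs:

```python
prices, signal = bivariate_series(200)
positions = [1 if s > 0 else -1 if s < 0 else 0 for s in signal]
print(episode_profit(prices, positions))
```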