Jeremy Turiel

14.1 | LG | Jan 18, 2021 | Code

Deep Reinforcement Learning for Active High Frequency Trading

Antonio Briola, Jeremy Turiel, Riccardo Marcaccioli et al.

We introduce the first end-to-end Deep Reinforcement Learning (DRL) based framework for active high frequency trading in the stock market. We train DRL agents to trade one unit of Intel Corporation stock using the Proximal Policy Optimization algorithm. Training is performed on three contiguous months of high frequency Limit Order Book (LOB) data, of which the last month constitutes the validation data. To maximise the signal-to-noise ratio in the training data, we build the training set from only the samples with the largest price changes. Testing is then carried out on the following month of data. Hyperparameters are tuned using the Sequential Model Based Optimization technique. We consider three different state characterizations, which differ in their LOB-based meta-features. Analysing the agents' performance on test data, we argue that the agents are able to create a dynamic representation of the underlying environment. They identify occasional regularities present in the data and exploit them to create long-term profitable trading strategies. Indeed, the agents learn trading strategies that produce stable positive returns in spite of the highly stochastic and non-stationary environment.
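The sample-selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact selection rule, so everything here is an assumption made for illustration: the function name `select_largest_moves`, the `keep_fraction` parameter, and the use of the absolute one-step change in a mid-price series as the ranking criterion.

```python
from typing import List, Sequence


def select_largest_moves(mid_prices: Sequence[float],
                         keep_fraction: float = 0.1) -> List[int]:
    """Return the time indices t (ascending) whose absolute one-step price
    change |p[t+1] - p[t]| is among the largest `keep_fraction` of changes.

    Hypothetical helper: the paper's actual criterion may differ.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    changes = [abs(b - a) for a, b in zip(mid_prices, mid_prices[1:])]
    if not changes:
        return []
    # Keep at least one sample so the training set is never empty.
    n_keep = max(1, int(len(changes) * keep_fraction))
    # Rank time steps by the size of the move that follows them, largest first.
    ranked = sorted(range(len(changes)), key=lambda t: changes[t], reverse=True)
    # Restore chronological order for the selected samples.
    return sorted(ranked[:n_keep])


# Toy mid-price series (made-up numbers, not market data).
prices = [100.0, 100.0, 100.5, 100.4, 99.0, 99.1]
selected = select_largest_moves(prices, keep_fraction=0.4)
print(selected)
```

Ranking by absolute change keeps both large upward and large downward moves. A fixed threshold on the change would be an equally plausible reading of the abstract.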

8.0 | TR | Jul 12, 2020

Deep Learning modeling of Limit Order Book: a comparative perspective

Antonio Briola, Jeremy Turiel, Tomaso Aste

The present work addresses theoretical and practical questions in the domain of Deep Learning for High Frequency Trading. State-of-the-art models such as Random models, Logistic Regressions, LSTMs, LSTMs equipped with an Attention mask, CNN-LSTMs and Multilayer Perceptrons (MLPs) are reviewed and compared on the same tasks, feature space and dataset, and then clustered according to pairwise similarity and performance metrics. The underlying dimensions of the modeling techniques are then investigated to understand whether they are intrinsic to the Limit Order Book's (LOB's) dynamics. We observe that the MLP performs comparably to or better than state-of-the-art CNN-LSTM architectures, indicating that dynamic spatial and temporal dimensions are a good approximation of the LOB's dynamics, but not necessarily the true underlying dimensions.
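One simple way to compare models by pairwise similarity is to measure how often two models predict the same label on the same samples. The sketch below does only that. The abstract does not say which similarity measure the authors use, so the agreement rate, the function name `pairwise_agreement`, and the toy predictions are all assumptions for illustration.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple


def pairwise_agreement(
    predictions: Dict[str, Sequence[int]]
) -> Dict[Tuple[str, str], float]:
    """Fraction of samples on which each pair of models predicts the same label.

    Hypothetical similarity measure, not necessarily the one used in the paper.
    """
    lengths = {len(p) for p in predictions.values()}
    if len(lengths) != 1 or 0 in lengths:
        raise ValueError("all models need the same non-zero number of predictions")
    scores = {}
    # Sort names so each pair key is in a deterministic (alphabetical) order.
    for a, b in combinations(sorted(predictions), 2):
        same = sum(x == y for x, y in zip(predictions[a], predictions[b]))
        scores[(a, b)] = same / len(predictions[a])
    return scores


# Made-up class predictions (-1 = down, 0 = stable, 1 = up) for three models.
preds = {
    "mlp": [1, 0, -1, 1],
    "cnn_lstm": [1, 0, -1, 0],
    "random": [0, 1, 1, -1],
}
agreement = pairwise_agreement(preds)
for pair, score in agreement.items():
    print(pair, score)
```

The resulting scores could serve as the similarity matrix for a clustering step, for example with `1 - score` as a distance.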


2 Papers