Yadh Hafsi

h-index6
2papers

2 Papers

TRNov 10, 2024
Optimal Execution with Reinforcement Learning

Yadh Hafsi, Edoardo Vittori

This study investigates the development of an optimal execution strategy through reinforcement learning, aiming to determine the most effective approach for traders to buy and sell inventory within a finite time horizon. Our proposed model leverages input features derived from the current state of the limit order book and operates at a high frequency to maximize control. To simulate this environment and overcome the limitations associated with relying on historical data, we utilize the multi-agent market simulator ABIDES, which provides a diverse range of depth levels within the limit order book. We present a custom MDP formulation followed by the results of our methodology and benchmark the performance against standard execution strategies. Results show that the reinforcement learning agent outperforms standard strategies and offers a practical foundation for real-world trading applications.

TRNov 19, 2025
Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution

Tomas Espana, Yadh Hafsi, Fabrizio Lillo et al.

We investigate the use of Reinforcement Learning for the optimal execution of meta-orders, where the objective is to execute incrementally large orders while minimizing implementation shortfall and market impact over an extended period of time. Departing from traditional parametric approaches to price dynamics and impact modeling, we adopt a model-free, data-driven framework. Since policy optimization requires counterfactual feedback that historical data cannot provide, we employ the Queue-Reactive Model to generate realistic and tractable limit order book simulations that encompass transient price impact, and nonlinear and dynamic order flow responses. Methodologically, we train a Double Deep Q-Network agent on a state space comprising time, inventory, price, and depth variables, and evaluate its performance against established benchmarks. Numerical simulation results show that the agent learns a policy that is both strategic and tactical, adapting effectively to order book conditions and outperforming standard approaches across multiple training configurations. These findings provide strong evidence that model-free Reinforcement Learning can yield adaptive and robust solutions to the optimal execution problem.