TR LG Jan 29

Diverse Approaches to Optimal Execution Schedule Generation

arXiv:2601.22113v2
h-index: 20
Originality: Incremental advance
AI Analysis

This paper addresses adaptive trade execution for financial institutions, offering an incremental advance through a novel application of quality-diversity methods.

The paper applies MAP-Elites, a quality-diversity algorithm, to the generation of trade execution schedules, producing a portfolio of regime-specialist strategies for different liquidity and volatility conditions. Individual specialists achieve 8-10% performance improvements within their niches. Separately, a CNN-based PPO policy achieves 2.13 bps arrival slippage versus 5.23 bps for VWAP on out-of-sample orders.
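The archive-of-specialists idea in the summary above can be illustrated with a minimal MAP-Elites sketch. Everything here is a toy assumption rather than the paper's setup: the 3x3 liquidity/volatility grid, the `toy_cost` function, the six-bucket schedule, and the choice to treat the evaluation regime as the cell are all hypothetical stand-ins for the calibrated simulator and PPO policies.

```python
import random

N_BUCKETS = 6   # time buckets per schedule (assumed for illustration)
GRID = 3        # 3 liquidity levels x 3 volatility levels (assumed)

def normalize(weights):
    """Scale weights so the schedule fractions sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def toy_cost(schedule, liquidity, volatility):
    """Hypothetical cost: impact shrinks with liquidity, timing risk grows with volatility."""
    impact = sum(x ** 1.5 for x in schedule) / (1 + liquidity)
    remaining, risk = 1.0, 0.0
    for x in schedule:
        remaining -= x            # inventory still to trade after this bucket
        risk += remaining ** 2    # unexecuted inventory is exposed to price moves
    return impact + 0.05 * volatility * risk

def mutate(schedule, rng, scale=0.1):
    """Gaussian perturbation followed by renormalization."""
    return normalize([max(1e-6, x + rng.gauss(0, scale)) for x in schedule])

def map_elites(iterations=2000, seed=0):
    rng = random.Random(seed)
    baseline = normalize([1.0] * N_BUCKETS)   # uniform, TWAP-like starting schedule
    # Archive: one elite (schedule, cost) per (liquidity, volatility) cell.
    archive = {
        (liq, vol): (baseline, toy_cost(baseline, liq, vol))
        for liq in range(GRID) for vol in range(GRID)
    }
    for _ in range(iterations):
        parent, _ = archive[rng.choice(list(archive))]      # pick a random elite
        child = mutate(parent, rng)
        cell = (rng.randrange(GRID), rng.randrange(GRID))   # regime to evaluate in
        cost = toy_cost(child, *cell)
        if cost < archive[cell][1]:                         # keep only improvements
            archive[cell] = (child, cost)
    return archive

if __name__ == "__main__":
    for cell, (schedule, cost) in sorted(map_elites().items()):
        print(cell, round(cost, 4), [round(x, 2) for x in schedule])
```

Because a cell's elite is replaced only when a candidate is strictly better, each cell's cost never rises above its starting baseline in this toy. That guarantee comes from seeding the archive with the baseline, which is a design choice of this sketch and not something the summary attributes to the paper.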

We present the first application of MAP-Elites, a quality-diversity algorithm, to trade execution. Rather than searching for a single optimal policy, MAP-Elites generates a diverse portfolio of regime-specialist strategies indexed by liquidity and volatility conditions. Individual specialists achieve 8-10% performance improvements within their behavioural niches, while other cells show degradation, suggesting opportunities for ensemble approaches that combine improved specialists with the baseline PPO policy. Results indicate that quality-diversity methods offer promise for regime-adaptive execution, though substantial computational resources per behavioural cell may be required for robust specialist development across all market conditions. To ensure experimental integrity, we develop a calibrated Gymnasium environment focused on order scheduling rather than tactical placement decisions. The simulator features a transient impact model with exponential decay and square-root volume scaling, fit to 400+ U.S. equities with $R^2>0.02$ out-of-sample. Within this environment, two Proximal Policy Optimization architectures, one with an MLP and one with a CNN feature extractor, demonstrate substantial improvements over industry baselines, with the CNN variant achieving 2.13 bps arrival slippage versus 5.23 bps for VWAP on 4,900 out-of-sample orders ($21B notional). These results both validate the simulation realism and provide strong single-policy baselines for quality-diversity methods.
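To make the simulator description concrete, here is a minimal sketch of a transient impact model with exponential decay and square-root volume scaling, together with an arrival-slippage calculation in basis points. The functional form, the parameter names (`sigma`, `coef`, `half_life`), their default values, and the flat mid-price assumption are illustrative guesses; the abstract does not give the paper's calibrated specification.

```python
import math

def simulate_impact(child_qty, bucket_volume, sigma=0.01, coef=0.5, half_life=2.0):
    """Return the impact (as a fraction of price) felt at each time bucket.

    Assumed form: impact_t = impact_{t-1} * decay + coef * sigma * sqrt(q_t / V_t),
    where decay halves the carried-over impact every `half_life` buckets.
    """
    decay = math.exp(-math.log(2) / half_life)
    impact, path = 0.0, []
    for q, v in zip(child_qty, bucket_volume):
        impact = impact * decay + coef * sigma * math.sqrt(q / v)
        path.append(impact)
    return path

def arrival_slippage_bps(child_qty, impact_path, arrival_price=100.0):
    """Slippage of the average fill price versus the arrival price, for a buy order.

    Assumes the unperturbed mid price stays flat at `arrival_price`.
    """
    total = sum(child_qty)
    avg_fill = sum(
        q * arrival_price * (1 + i) for q, i in zip(child_qty, impact_path)
    ) / total
    return (avg_fill - arrival_price) / arrival_price * 1e4

if __name__ == "__main__":
    volume = [10_000] * 4
    for name, qty in [("front-loaded", [1000, 0, 0, 0]), ("even", [250] * 4)]:
        path = simulate_impact(qty, volume)
        print(name, round(arrival_slippage_bps(qty, path), 2), "bps")
```

In this toy the square-root term rewards splitting an order into smaller child orders, while the decay term rewards spacing them out. That tension is what makes scheduling a non-trivial optimization problem.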

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
