Sequential Structure in Intraday Futures Data: LSTM vs Gradient Boosting on MNQ

arXiv:2605.17724


AI Analysis

For practitioners in quantitative finance, this paper provides an empirical lower bound on data scale requirements for sequential financial ML, showing that a single-instrument dataset of four years is inadequate for reliable intraday forecasting.

This paper compares LSTM and gradient boosting models for intraday directional prediction of Micro E-Mini Nasdaq 100 futures and finds no statistically significant predictive edge over the 51.8% base rate in any of four model configurations. The best model reached 50.89% out-of-sample accuracy, below the base rate, with a permutation-test p-value of 0.135; the authors conclude that four years of five-minute OHLCV data are insufficient for reliable sequential ML-based forecasting.
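The comparison of an accuracy figure against a base rate and a permutation p-value can be illustrated with a minimal sketch. The paper's exact permutation scheme is not described here, so the label-shuffling procedure, the function names `base_rate` and `permutation_p_value`, and the defaults for `n_perm` and `seed` are all assumptions made for illustration.

```python
import numpy as np


def base_rate(y_true):
    """Accuracy of always predicting the majority class."""
    p_up = float(np.mean(np.asarray(y_true)))
    return max(p_up, 1.0 - p_up)


def permutation_p_value(y_true, y_pred, n_perm=10_000, seed=0):
    """One-sided p-value for accuracy under shuffled labels.

    Assumption: the null distribution is built by permuting the true
    labels while holding the predictions fixed. The paper may use a
    different scheme.
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    observed = np.mean(y_true == y_pred)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(y_true)
        if np.mean(shuffled == y_pred) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Under this scheme, a model that always predicts one class scores exactly that class's frequency and receives a p-value of 1.0, because shuffling the labels cannot change its accuracy. That is why an accuracy near or below the base rate carries no evidence of predictive edge.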

This paper compares gradient boosting and long short-term memory (LSTM) architectures for intraday directional prediction in Micro E-Mini Nasdaq 100 futures (MNQ). Motivated by recent foundation-model research on financial candlestick data, including the Kronos architecture, we test whether five-minute OHLCV bar sequences contain exploitable sequential predictive structure at the scale of a single instrument dataset. Using 944 trading days from 2021-2025, four model configurations are evaluated under strict expanding-window walk-forward validation across three out-of-sample periods. The target variable is whether the session close exceeds the 10:30 AM open by more than ten points. No configuration produces statistically significant out-of-sample accuracy above the 51.8% base rate. Combined OOS accuracies range from 50.00% to 50.89% across gradient boosting variants, while the LSTM achieves 50.59%. Permutation tests yield p-values of 0.135 for the best gradient boosting model and 0.515 for the LSTM, indicating no statistically significant predictive edge. Feature importance instability across walk-forward folds suggests noise fitting rather than stable structural signal capture. The results indicate that four years of single-instrument five-minute OHLCV data are insufficient for reliable sequential ML-based intraday forecasting. The primary contribution is a documented evaluation of a Kronos-inspired architecture on a constrained real-world dataset, providing an empirical lower bound on data scale requirements for sequential financial ML.
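The expanding-window walk-forward scheme mentioned in the abstract can be sketched as follows. The abstract gives the number of trading days (944) and of out-of-sample periods (three) but not the fold boundaries, so the equal-sized test blocks, the function name `expanding_window_splits`, and the initial training size of 500 days are hypothetical choices, not the authors' configuration.

```python
import numpy as np


def expanding_window_splits(n_samples, n_folds, min_train):
    """Yield (train_idx, test_idx) pairs for walk-forward validation.

    Each fold trains on every observation before its test block, so the
    training window grows over time and never sees future data.
    Assumption: the days after `min_train` are divided into `n_folds`
    contiguous test blocks of near-equal size.
    """
    if n_folds < 1:
        raise ValueError("n_folds must be at least 1")
    if not 0 < min_train < n_samples:
        raise ValueError("min_train must be between 1 and n_samples - 1")
    test_blocks = np.array_split(np.arange(min_train, n_samples), n_folds)
    for test_idx in test_blocks:
        if len(test_idx) == 0:
            continue
        train_idx = np.arange(0, test_idx[0])
        yield train_idx, test_idx


# Illustrative use: 944 trading days, three out-of-sample periods,
# and a hypothetical initial training window of 500 days.
for train_idx, test_idx in expanding_window_splits(944, 3, 500):
    print(len(train_idx), len(test_idx))
```

Keeping each test block strictly after its training window is what makes the combined out-of-sample accuracy a fair estimate. A shuffled cross-validation split would leak later sessions into training.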
