LG, AI · Aug 1, 2025

Forecasting NCAA Basketball Outcomes with Deep Learning: A Comparative Study of LSTM and Transformer Models

arXiv:2508.02725v1 · h-index: 1
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement for sports analytics, providing a reproducible framework for predictive modeling in basketball and similar domains.

The study tackled forecasting NCAA basketball tournament outcomes by comparing LSTM and Transformer models, finding that Transformers achieved the highest AUC of 0.8473 for discriminative power, while LSTMs had the lowest Brier score of 0.1589 for probabilistic calibration.
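The two headline metrics measure different things, which is why one model can lead on AUC while another leads on Brier score. The following is a minimal sketch in plain Python, not the paper's evaluation code; the function names and the toy predictions are hypothetical and chosen only to illustrate the distinction. AUC depends solely on how predictions are ranked, whereas the Brier score penalizes the squared distance between each predicted probability and the actual outcome.

```python
def brier_score(y_true, p_pred):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def auc(y_true, p_pred):
    """Probability that a random positive is ranked above a random negative.

    Ties count as half a win. This O(n^2) pairwise form is for clarity only.
    """
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = first team won, 0 = first team lost.
outcomes = [1, 0, 1, 0]
confident = [0.90, 0.10, 0.80, 0.20]  # same ranking, probabilities near 0/1
timid = [0.60, 0.40, 0.55, 0.45]      # same ranking, probabilities near 0.5

auc_confident = auc(outcomes, confident)
auc_timid = auc(outcomes, timid)
brier_confident = brier_score(outcomes, confident)
brier_timid = brier_score(outcomes, timid)
```

Both toy models rank every winner above every loser, so their AUC is identical, but the confident model's probabilities sit closer to the outcomes and so earn a lower Brier score.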

In this research, I explore advanced deep learning methodologies to forecast the outcomes of the 2025 NCAA Division 1 Men's and Women's Basketball tournaments. Leveraging historical NCAA game data, I implement two sophisticated sequence-based models: Long Short-Term Memory (LSTM) and Transformer architectures. The predictive power of these models is augmented through comprehensive feature engineering, including team quality metrics derived from Generalized Linear Models (GLM), Elo ratings, seed differences, and aggregated box-score statistics. To evaluate the robustness and reliability of predictions, I train each model variant using both Binary Cross-Entropy (BCE) and Brier loss functions, providing insights into classification performance and probability calibration. My comparative analysis reveals that while the Transformer architecture optimized with BCE yields superior discriminative power (highest AUC of 0.8473), the LSTM model trained with Brier loss demonstrates superior probabilistic calibration (lowest Brier score of 0.1589). These findings underscore the importance of selecting appropriate model architectures and loss functions based on the specific requirements of forecasting tasks. The detailed analytical pipeline presented here serves as a reproducible framework for future predictive modeling tasks in sports analytics and beyond.
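As an illustration of one of the engineered features named above, here is a minimal sketch of how an Elo rating difference might be computed from a chronological game log. The abstract does not specify the Elo variant used, so the constants below (base rating 1500, K-factor 20, scale 400) and all function names are assumptions based on the standard Elo formulation, not the author's settings.

```python
def elo_expected(r_a, r_b, scale=400.0):
    """Standard Elo expected score for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def elo_update(r_a, r_b, a_won, k=20.0):
    """Return updated (r_a, r_b) after one game; k is an assumed K-factor."""
    expected_a = elo_expected(r_a, r_b)
    actual_a = 1.0 if a_won else 0.0
    delta = k * (actual_a - expected_a)
    return r_a + delta, r_b - delta


def elo_features(games, base=1500.0, k=20.0):
    """Walk games in chronological order and emit the pre-game Elo difference.

    `games` is a list of (team_a, team_b, a_won) tuples. The feature is
    recorded before the ratings are updated, so it never sees the result
    of the game it describes.
    """
    ratings = {}
    diffs = []
    for team_a, team_b, a_won in games:
        r_a = ratings.get(team_a, base)
        r_b = ratings.get(team_b, base)
        diffs.append(r_a - r_b)
        ratings[team_a], ratings[team_b] = elo_update(r_a, r_b, a_won, k)
    return diffs, ratings


# Hypothetical two-game log: team "A" beats team "B" twice.
games = [("A", "B", True), ("A", "B", True)]
diffs, ratings = elo_features(games)
```

Recording the difference before the update matters for any sequence model: a feature that already reflects the game's outcome would leak the label into the input.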

