LGAI · Feb 19, 2025

Position: There are no Champions in Long-Term Time Series Forecasting

arXiv:2502.14045v1 · 9 citations · h-index: 5 · Trans. Mach. Learn. Res.
Originality: Synthesis-oriented
AI Analysis

This paper addresses the reliability of model comparisons, a concern for both researchers and practitioners in time series forecasting. Its contribution is incremental, however: it critiques existing practices rather than introducing new models.

The paper tackles inconsistent benchmarking in long-term time series forecasting. It conducts a reproducible evaluation of top-performing models on 14 datasets and finds that small changes to the experimental setup can shift which model appears to be state of the art, which highlights the need for standardized evaluation methods.

Recent advances in long-term time series forecasting have introduced numerous complex prediction models that consistently outperform previously published architectures. However, this rapid progression raises concerns regarding inconsistent benchmarking and reporting practices, which may undermine the reliability of these comparisons. Our position emphasizes the need to shift focus away from pursuing ever-more complex models and towards enhancing benchmarking practices through rigorous and standardized evaluation methods. To support our claim, we first perform a broad, thorough, and reproducible evaluation of the top-performing models on the most popular benchmark by training 3,500+ networks over 14 datasets. Then, through a comprehensive analysis, we find that slight changes to experimental setups or current evaluation metrics drastically shift the common belief that newly published results are advancing the state of the art. Our findings suggest the need for rigorous and standardized evaluation methods that enable more substantiated claims, including reproducible hyperparameter setups and statistical testing.
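
The abstract calls for statistical testing when comparing models but, as summarized here, does not spell out a procedure. As an illustration only, the sketch below shows one possible approach: an exact paired sign-flip permutation test on per-seed errors of two models. The function name `paired_sign_flip_test` and all MSE values are hypothetical and are not taken from the paper.

```python
from itertools import product


def paired_sign_flip_test(errors_a, errors_b):
    """Exact two-sided paired permutation (sign-flip) test.

    errors_a and errors_b hold one error value per run (e.g. per seed),
    paired so that index i used the same seed and setup for both models.
    Returns the p-value for the null hypothesis that the mean paired
    difference is zero.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("both models need the same number of paired runs")
    if not errors_a:
        raise ValueError("at least one paired run is required")

    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n

    # Under the null, each paired difference is equally likely to have
    # either sign, so enumerate all 2**n sign assignments.
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, diffs))) / n
        # Small tolerance guards against floating-point noise in ties.
        if stat >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical per-seed test MSEs for two models on one dataset.
mse_model_a = [0.40, 0.41, 0.39, 0.42, 0.40]
mse_model_b = [0.41, 0.40, 0.40, 0.41, 0.41]

p_value = paired_sign_flip_test(mse_model_a, mse_model_b)
print(f"p-value: {p_value:.3f}")
```

Exact enumeration is only practical for a small number of runs, since the loop visits 2**n sign assignments. With many seeds or datasets, a sampled permutation test or a rank-based test would be a more realistic choice.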

