LG Feb 10

Statistical benchmarking of transformer models in low signal-to-noise time-series forecasting

arXiv:2602.09869v11.4
Originality: Incremental advance
AI Analysis

This work addresses forecasting in noisy, data-scarce domains such as finance or climate. It is incremental in that it builds on existing transformer architectures and adds a new sparsification technique.

The paper tackles multivariate time-series forecasting in low-data, low signal-to-noise regimes by evaluating transformer models. It shows that two-way attention transformers can outperform standard baselines across a range of settings, and that dynamic sparsification improves performance in noisy environments where the correlation between the target and the optimal predictor is only a few percent.
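
To make "a correlation of a few percent" concrete, here is a minimal sketch of the kind of evaluation the summary describes: score a fitted model by its out-of-sample correlation with the known optimal predictor, and bootstrap that score. Everything in it is an assumption for illustration, not the paper's setup: the linear data-generating process, the least-squares model, the sample sizes, and the helper names `make_low_snr_data` and `bootstrap_corr` are all hypothetical.

```python
import numpy as np

def make_low_snr_data(n_obs, n_features, target_corr, rng):
    """Hypothetical linear process: y = signal + noise.

    `signal` plays the role of the optimal ground-truth predictor; the noise
    scale is chosen so that corr(y, signal) is approximately `target_corr`.
    """
    X = rng.standard_normal((n_obs, n_features))
    beta = rng.standard_normal(n_features)
    signal = X @ beta
    signal /= signal.std()  # unit-variance optimal predictor
    # corr(y, signal) = 1 / sqrt(1 + noise_std**2)  =>  solve for noise_std
    noise_std = np.sqrt(1.0 / target_corr**2 - 1.0)
    y = signal + noise_std * rng.standard_normal(n_obs)
    return X, y, signal

def bootstrap_corr(pred, optimal, n_boot, rng):
    """Mean and spread of corr(pred, optimal) over resampled test points."""
    n = len(pred)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample test indices with replacement
        stats[b] = np.corrcoef(pred[idx], optimal[idx])[0, 1]
    return stats.mean(), stats.std()

rng = np.random.default_rng(0)
# Roughly six years of daily observations, target/optimal correlation of 5%.
X, y, optimal = make_low_snr_data(n_obs=1500, n_features=5, target_corr=0.05, rng=rng)
train, test = slice(0, 1000), slice(1000, 1500)

# Stand-in model: ordinary least squares fitted on the noisy target only.
coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
pred = X[test] @ coef

mean_corr, std_corr = bootstrap_corr(pred, optimal[test], n_boot=200, rng=rng)
print(f"corr with optimal predictor: {mean_corr:.3f} +/- {std_corr:.3f}")
```

Scoring against the optimal predictor rather than the noisy target is only possible because the process is synthetic. It removes the irreducible noise from the metric, so differences between models are easier to see.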

We study the performance of transformer architectures for multivariate time-series forecasting in low-data regimes consisting of only a few years of daily observations. Using synthetically generated processes with known temporal and cross-sectional dependency structures and varying signal-to-noise ratios, we conduct bootstrapped experiments that enable direct evaluation via out-of-sample correlations with the optimal ground-truth predictor. We show that two-way attention transformers, which alternate between temporal and cross-sectional self-attention, can outperform standard baselines (Lasso, boosting methods, and fully connected multilayer perceptrons) across a wide range of settings, including low signal-to-noise regimes. We further introduce a dynamic sparsification procedure for attention matrices applied during training, and demonstrate that it becomes significantly effective in noisy environments, where the correlation between the target variable and the optimal predictor is on the order of a few percent. Analysis of the learned attention patterns reveals interpretable structure and suggests connections to sparsity-inducing regularization in classical regression, providing insight into why these models generalize effectively under noise.
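
The alternation between temporal and cross-sectional self-attention can be sketched as follows. This is an illustration under stated assumptions, not the authors' architecture: it uses a single head with identity query/key/value projections, a causal mask on the temporal axis, residual connections, and a fixed top-k mask as a stand-in for the paper's dynamic sparsification procedure, whose details the abstract does not give. The names `sparse_self_attention`, `two_way_block`, `keep_time`, and `keep_cross` are hypothetical.

```python
import numpy as np

def softmax(scores, axis=-1):
    scores = scores - scores.max(axis=axis, keepdims=True)  # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=axis, keepdims=True)

def sparse_self_attention(x, keep, causal=False):
    """Single-head self-attention over axis -2 of x, shape (batch, length, dim).

    Assumptions: identity Q/K/V projections, and a top-`keep` mask per row as
    a simple stand-in for a learned or scheduled sparsification rule.
    """
    length, dim = x.shape[-2], x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(dim)  # (batch, length, length)
    if causal:
        # Block attention to future positions.
        future = np.triu(np.ones((length, length), dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    if keep < length:
        # Keep only scores at or above the keep-th largest in each row.
        kth = np.sort(scores, axis=-1)[..., -keep][..., None]
        scores = np.where(scores >= kth, scores, -np.inf)
    weights = softmax(scores)
    return weights @ x, weights

def two_way_block(x, keep_time, keep_cross):
    """x has shape (T, N, D): T time steps, N series, D features per series."""
    # Temporal attention: each series is a batch element, attending over time.
    out_t, w_time = sparse_self_attention(np.swapaxes(x, 0, 1), keep_time, causal=True)
    x = x + np.swapaxes(out_t, 0, 1)  # residual connection
    # Cross-sectional attention: each time step is a batch element,
    # attending over the N series observed at that step.
    out_c, w_cross = sparse_self_attention(x, keep_cross)
    return x + out_c, w_time, w_cross

rng = np.random.default_rng(0)
x = rng.standard_normal((60, 8, 4))  # 60 days, 8 series, 4 features
out, w_time, w_cross = two_way_block(x, keep_time=5, keep_cross=3)
print(out.shape, w_time.shape, w_cross.shape)
```

Zeroing all but a few attention weights per row is loosely analogous to sparsity-inducing regularization in regression: each output depends on a small set of inputs, which limits how much noise the model can fit.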

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
