LG AI ME ML | Oct 17, 2025

Beyond Accuracy: Are Time Series Foundation Models Well-Calibrated?

Coen Adler, Yuxin Chang, Felix Draxler, Samar Abdi, Padhraic Smyth

arXiv:2510.16060v1 | 7.11 citations | h-index: 3

Originality: Incremental advance

AI Analysis

This paper addresses the calibration of time series foundation models, which is critical for practical applications. The advance is incremental because the paper evaluates existing models rather than proposing new ones.

The paper investigates the calibration properties of five recent time series foundation models and two baselines, finding that these foundation models are consistently better calibrated than baselines and avoid systematic over- or under-confidence.

The recent development of foundation models for time series data has generated considerable interest in using such models across a variety of applications. Although foundation models achieve state-of-the-art predictive performance, their calibration properties remain relatively underexplored, despite the fact that calibration can be critical for many practical applications. In this paper, we investigate the calibration-related properties of five recent time series foundation models and two competitive baselines. We perform a series of systematic evaluations assessing model calibration (i.e., over- or under-confidence), effects of varying prediction heads, and calibration under long-term autoregressive forecasting. We find that time series foundation models are consistently better calibrated than baseline models and tend not to be either systematically over- or under-confident, in contrast to the overconfidence often seen in other deep learning models.
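To make "over- or under-confidence" concrete, here is a minimal sketch of one common way to check calibration of quantile forecasts: compare each nominal quantile level with the fraction of observations that actually fall at or below the predicted quantile. This is an illustration under stated assumptions, not necessarily the metric used in the paper. The function names, the synthetic Gaussian data, and the choice of quantile levels are all hypothetical.

```python
from statistics import NormalDist

import numpy as np


def empirical_coverage(y_true, quantile_preds):
    """Fraction of observations at or below each predicted quantile.

    y_true: shape (n,); quantile_preds: shape (n, k), one column per level.
    """
    y_true = np.asarray(y_true, dtype=float)
    quantile_preds = np.asarray(quantile_preds, dtype=float)
    return (y_true[:, None] <= quantile_preds).mean(axis=0)


def calibration_error(y_true, quantile_preds, levels):
    """Mean absolute gap between empirical coverage and nominal levels."""
    coverage = empirical_coverage(y_true, quantile_preds)
    return float(np.mean(np.abs(coverage - np.asarray(levels, dtype=float))))


def central_interval_coverage(y_true, lower, upper):
    """Share of observations inside [lower, upper]."""
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


def gaussian_quantile_forecast(n, levels, sigma):
    """Hypothetical forecaster: the same N(0, sigma^2) quantiles at every step."""
    z = np.array([NormalDist().inv_cdf(p) for p in levels])
    return np.tile(sigma * z, (n, 1))


# Synthetic check (assumption: the true data are standard normal).
rng = np.random.default_rng(0)
y = rng.normal(size=5000)
levels = [0.1, 0.25, 0.5, 0.75, 0.9]

calibrated = gaussian_quantile_forecast(y.size, levels, sigma=1.0)
overconfident = gaussian_quantile_forecast(y.size, levels, sigma=0.5)

err_calibrated = calibration_error(y, calibrated, levels)
err_overconfident = calibration_error(y, overconfident, levels)

# Nominal 80% interval (10% to 90% quantiles) for the too-narrow forecaster.
cov_overconfident = central_interval_coverage(
    y, overconfident[:, 0], overconfident[:, -1]
)

print(f"calibration error, sigma=1.0: {err_calibrated:.3f}")
print(f"calibration error, sigma=0.5: {err_overconfident:.3f}")
print(f"80% interval coverage, sigma=0.5: {cov_overconfident:.3f}")
```

In this sketch, an overconfident forecaster is one whose intervals are too narrow, so its nominal 80% interval covers noticeably fewer than 80% of the observations. An underconfident forecaster would show the opposite pattern.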
