Tiny-TSM: Efficiently Training a Lightweight SOTA Time Series Foundation Model
Tiny-TSM offers a practical, resource-efficient approach to time series forecasting, though its efficiency gains over existing methods are incremental.
The authors address the problem of training time series foundation models efficiently by introducing Tiny-TSM, a 23M-parameter model trained in less than a week on a single A100 GPU. It achieves state-of-the-art performance on medium- and long-term forecasting tasks and often outperforms much larger models.
We present Tiny-TSM, a time series foundation model characterized by small scale, economical training, and state-of-the-art performance. It comprises 23M total parameters and is trained on a single A100 GPU in less than a week using a new synthetic data generation and data augmentation pipeline (SynthTS). Without any neural architecture search, hyperparameter tuning, or scaling up of model size, Tiny-TSM achieves state-of-the-art performance on a wide range of time series benchmark datasets, often outperforming much larger models and even matching the performance of industrial-scale, likely highly tuned foundation models. Specifically, Tiny-TSM outperforms all other time series foundation models we evaluated on medium- and long-term forecasting tasks under MSE loss, while its short-term accuracy remains competitive with state-of-the-art models. We also introduce a causal input normalization scheme that enables time series models to be trained with a dense next-token prediction loss, significantly accelerating convergence and reducing training time. All experiments were conducted on a single A100 GPU, illustrating the practicality of the proposed approach in a resource-constrained setting.
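The abstract does not spell out the causal input normalization scheme. One plausible reading, shown here purely as an illustrative sketch and not as the authors' implementation, is that each time step is normalized using only statistics of the values seen up to and including that step. Under that assumption, no position's input depends on future values, so a next-step prediction loss can be applied at every position of the sequence rather than only at the end of the context. The function name `causal_normalize`, the use of prefix mean and standard deviation, and the `eps` parameter are all assumptions made for this example.

```python
import numpy as np


def causal_normalize(x, eps=1e-8):
    """Normalize each step with the mean/std of the prefix x[..., :t+1].

    Hypothetical sketch: the choice of prefix mean/std is an assumption,
    not the scheme described in the paper.
    """
    x = np.asarray(x, dtype=np.float64)
    # Number of observations available at each position: 1, 2, ..., T.
    n = np.arange(1, x.shape[-1] + 1)
    # Running (prefix) mean and mean of squares along the time axis.
    mean = np.cumsum(x, axis=-1) / n
    mean_sq = np.cumsum(x * x, axis=-1) / n
    # Clamp at zero to guard against tiny negative values from rounding.
    var = np.maximum(mean_sq - mean ** 2, 0.0)
    std = np.sqrt(var)
    # eps keeps the division finite when the prefix is constant.
    return (x - mean) / (std + eps), mean, std


# Changing future values leaves earlier normalized values untouched.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = a.copy()
b[3:] = [100.0, -50.0]
za, _, _ = causal_normalize(a)
zb, _, _ = causal_normalize(b)
print(np.allclose(za[:3], zb[:3]))
```

By contrast, normalizing with statistics computed over the whole context window would let every position see a summary of later values, which is why such schemes are usually paired with a loss on the forecast horizon only.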