LG | AI | Oct 9, 2025

Synthetic Series-Symbol Data Generation for Time Series Foundation Models

arXiv:2510.08445v3 | h-index: 2 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses data scarcity for researchers and practitioners in time series analysis. It is an incremental advance: it builds on existing foundation model approaches and contributes a novel data generation method.

The paper tackles the problem of training data scarcity and imbalance for time series foundation models by introducing a series-symbol data generation mechanism, which enables the creation of synthetic time series data paired with symbolic expressions. The result is SymTime, a pre-trained foundation model that achieves competitive performance across five major time series analysis tasks, rivaling models trained on real-world datasets.

Foundation models for time series analysis (TSA) have attracted significant attention. However, challenges such as training data scarcity and imbalance continue to hinder their development. Inspired by complex dynamic system theories, we design a series-symbol data generation mechanism, enabling the unrestricted creation of high-quality time series data paired with corresponding symbolic expressions. To leverage series-symbol data pairs with strong correlations, we develop SymTime, a pre-trained foundation model for enhancing time series representation using symbolic information. SymTime demonstrates competitive performance across five major TSA tasks when fine-tuned on downstream tasks, rivaling foundation models pre-trained on real-world datasets. This approach underscores the potential of series-symbol data generation and pretraining mechanisms in overcoming data scarcity and enhancing task performance. The code is available at https://github.com/wwhenxuan/SymTime.
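To make the idea of a series-symbol pair concrete, the following is a minimal sketch, not the authors' mechanism: it builds a random symbolic expression in one variable `t` and evaluates it on a grid to obtain a matching series. The operator tables, the function names (`random_expression`, `to_symbol`, `evaluate`, `generate_pair`), the sampling probabilities, and the time grid are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
import random

# Hypothetical operator tables, chosen for illustration only.
UNARY = {"sin": math.sin, "cos": math.cos, "tanh": math.tanh}
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def random_expression(rng, depth):
    """Build a random expression tree over the single variable t."""
    if depth == 0 or rng.random() < 0.2:
        # Leaf: the variable t, or a constant rounded to two decimals.
        return "t" if rng.random() < 0.7 else round(rng.uniform(-2.0, 2.0), 2)
    if rng.random() < 0.5:
        return (rng.choice(sorted(UNARY)), random_expression(rng, depth - 1))
    return (
        rng.choice(sorted(BINARY)),
        random_expression(rng, depth - 1),
        random_expression(rng, depth - 1),
    )


def to_symbol(expr):
    """Render an expression tree as a human-readable string."""
    if isinstance(expr, str):
        return expr
    if isinstance(expr, (int, float)):
        return str(expr)
    if len(expr) == 2:
        return f"{expr[0]}({to_symbol(expr[1])})"
    return f"({to_symbol(expr[1])} {expr[0]} {to_symbol(expr[2])})"


def evaluate(expr, t):
    """Evaluate an expression tree at time t."""
    if isinstance(expr, str):
        return t
    if isinstance(expr, (int, float)):
        return float(expr)
    if len(expr) == 2:
        return UNARY[expr[0]](evaluate(expr[1], t))
    return BINARY[expr[0]](evaluate(expr[1], t), evaluate(expr[2], t))


def generate_pair(seed, length=64, depth=3):
    """Return one (symbolic string, sampled series) pair for a given seed."""
    rng = random.Random(seed)
    expr = random_expression(rng, depth)
    grid = [10.0 * i / (length - 1) for i in range(length)]  # assumed t in [0, 10]
    return to_symbol(expr), [evaluate(expr, t) for t in grid]
```

Calling `generate_pair(seed)` with different seeds yields as many pairs as needed, which is the property the abstract emphasizes: the supply of paired data is limited only by compute, not by collection effort.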

