LG | May 15, 2025

ChronoSteer: Bridging Large Language Model and Time Series Foundation Model via Synthetic Data

Chengsen Wang, Qi Qi, Zhongwen Rao, Lujia Pan, Jingyu Wang, Jianxin Liao

arXiv:2505.10083v1 | 9.43 citations | h-index: 33

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of leveraging multimodal data for time series forecasting. The advance is incremental because it builds on existing LLM and TSFM capabilities.

The paper integrates textual information with time series data for forecasting by proposing ChronoSteer, a multimodal model in which an LLM generates revision instructions that steer a time series foundation model. It reports a 25.7% improvement in prediction accuracy over its unimodal backbone and a 22.5% gain over the previous state-of-the-art multimodal method.

Conventional forecasting methods rely on unimodal time series data, limiting their ability to exploit rich textual information. Recently, large language models (LLMs) and time series foundation models (TSFMs) have demonstrated strong capabilities in textual reasoning and temporal modeling, respectively. Integrating the strengths of both to construct a multimodal model that jointly leverages temporal and textual information for future inference has emerged as a critical research challenge. To address the scarcity of event-series paired data, we propose a decoupled framework: an LLM transforms textual events into revision instructions, which are then used to steer the output of the TSFM. To implement this framework, we introduce ChronoSteer, a multimodal TSFM that can be steered through textual revision instructions, effectively bridging the LLM and the TSFM. Moreover, to mitigate the shortage of cross-modal instruction-series paired data, we devise a two-stage training strategy based on synthetic data. In addition, we construct a high-quality multimodal time series forecasting benchmark to address information leakage concerns during evaluation. After integration with an LLM, ChronoSteer, which is trained exclusively on synthetic data, achieves a 25.7% improvement in prediction accuracy over the unimodal backbone and a 22.5% gain over the previous state-of-the-art multimodal method.
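The decoupled framework described in the abstract (textual event, then revision instruction, then steered forecast) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the three-word instruction vocabulary, the keyword rules standing in for the LLM, the last-value forecaster standing in for the TSFM, and the post-hoc ramp in `steer` are all hypothetical. ChronoSteer itself is a trained model that takes the instruction as input, and the abstract does not specify its instruction set or architecture.

```python
from typing import List

# Hypothetical instruction vocabulary (not taken from the paper).
INSTRUCTIONS = ("keep", "increase", "decrease")


def event_to_instruction(event_text: str) -> str:
    """Stand-in for the LLM step: map a textual event to a revision instruction.

    A real system would prompt an LLM; keyword rules are used here only so the
    sketch runs without external dependencies.
    """
    text = event_text.lower()
    if any(word in text for word in ("surge", "promotion", "heatwave")):
        return "increase"
    if any(word in text for word in ("outage", "closure", "recall")):
        return "decrease"
    return "keep"


def base_forecast(history: List[float], horizon: int) -> List[float]:
    """Stand-in for a unimodal TSFM: repeat the last observed value."""
    return [history[-1]] * horizon


def steer(forecast: List[float], instruction: str, strength: float = 0.2) -> List[float]:
    """Toy steering: scale the forecast along a linear ramp over the horizon.

    The ramp reaches +/- `strength` (as a fraction) at the final step.
    """
    if instruction not in INSTRUCTIONS:
        raise ValueError(f"unknown instruction: {instruction!r}")
    sign = {"keep": 0.0, "increase": 1.0, "decrease": -1.0}[instruction]
    n = len(forecast)
    return [y * (1.0 + sign * strength * (i + 1) / n) for i, y in enumerate(forecast)]


def forecast_with_event(history: List[float], horizon: int, event_text: str) -> List[float]:
    """Decoupled pipeline: text -> instruction -> steered forecast."""
    instruction = event_to_instruction(event_text)
    return steer(base_forecast(history, horizon), instruction)


if __name__ == "__main__":
    print(forecast_with_event([10.0, 12.0], 2, "Planned outage tomorrow"))
```

The point of the decoupling is that the text-understanding step and the forecasting step only share a small instruction interface, so paired event-series data is not needed to connect them; in this sketch that interface is the string returned by `event_to_instruction`.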

