CL

Jul 14, 2025

Fusing Large Language Models with Temporal Transformers for Time Series Forecasting

Chen Su, Yuanhe Tian, Qinyu Liu, Jun Zhang, Yan Song

arXiv:2507.10098v1

8.33 citations

h-index: 13

Originality: Incremental advance

AI Analysis

This work addresses the challenge of leveraging LLMs for time series forecasting. The contribution is incremental: it builds on existing methods to improve performance in a specific domain.

The paper tackles the problem of time series forecasting by integrating large language models (LLMs) with temporal Transformers to combine high-level semantic patterns with temporal dynamics, resulting in improved accuracy on benchmark datasets.

Large language models (LLMs) have demonstrated powerful capabilities across a wide range of tasks, and recent studies have applied them to time series forecasting (TSF), which predicts future values from a given historical time series. Existing LLM-based approaches transfer knowledge learned from text data to time series prediction through prompting or fine-tuning strategies. However, although LLMs are proficient at reasoning over discrete tokens and semantic patterns, they were not originally designed to model continuous numerical time series data. This gap between text and time series data leads LLMs to underperform a vanilla Transformer model trained directly on TSF data. Vanilla Transformers, in turn, often struggle to learn high-level semantic patterns. In this paper, we design a novel Transformer-based architecture that leverages LLMs and vanilla Transformers in a complementary way: the high-level semantic representations learned by the LLM are integrated into the temporal information encoded by a time series Transformer, and the representations from the two models are fused into a hybrid representation. The resulting fused representation contains both historical temporal dynamics and semantic variation patterns, allowing our model to predict future values more accurately. Experiments on benchmark datasets demonstrate the effectiveness of the proposed approach.
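The abstract does not say how the two representations are fused, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a gated sum: the LLM features are projected to the Transformer width, and a sigmoid gate mixes the two branches per feature. The function name `fuse_representations`, all weight matrices, the dimensions, and the linear forecasting head are hypothetical, and random arrays stand in for the outputs of the two encoders.

```python
import numpy as np


def fuse_representations(h_llm, h_ts, w_proj, w_gate):
    """Gated fusion of two feature sequences (illustrative assumption only).

    h_llm:  (seq_len, d_llm)   hidden states from a hypothetical LLM branch
    h_ts:   (seq_len, d_model) hidden states from a time series Transformer branch
    w_proj: (d_llm, d_model)   projection aligning the LLM width with d_model
    w_gate: (2 * d_model, d_model) weights that produce a per-feature gate
    """
    # Project the LLM features into the Transformer's feature space.
    h_sem = h_llm @ w_proj
    # Compute a sigmoid gate from both branches; values lie in (0, 1).
    gate_in = np.concatenate([h_sem, h_ts], axis=-1)
    gate = 1.0 / (1.0 + np.exp(-(gate_in @ w_gate)))
    # Convex combination of semantic and temporal features.
    return gate * h_sem + (1.0 - gate) * h_ts


# Random stand-ins for encoder outputs and (untrained) weights.
rng = np.random.default_rng(0)
seq_len, d_llm, d_model, horizon = 16, 32, 8, 4
h_llm = rng.normal(size=(seq_len, d_llm))
h_ts = rng.normal(size=(seq_len, d_model))
w_proj = rng.normal(scale=0.1, size=(d_llm, d_model))
w_gate = rng.normal(scale=0.1, size=(2 * d_model, d_model))
w_head = rng.normal(scale=0.1, size=(seq_len * d_model, horizon))

fused = fuse_representations(h_llm, h_ts, w_proj, w_gate)
# Hypothetical linear head mapping the flattened fused features to a forecast.
forecast = fused.reshape(-1) @ w_head
print(fused.shape, forecast.shape)
```

With an all-zero gate matrix the gate equals 0.5 everywhere, so the fused output reduces to the plain average of the two branches. A trained gate would instead learn how much semantic versus temporal information to keep for each feature.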
