CL · AI · Mar 5, 2025

Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs

Haoran Fan, Bin Li, Yixuan Weng, Shoujun Zhou

arXiv:2503.03594v2 · 2 citations · h-index: 12 · Has Code · J Supercomput

Originality Highly original

AI Analysis

This work addresses computational inefficiency for practitioners deploying time series forecasting models, offering a more resource-efficient alternative to large LLMs.

The paper tackles the problem of deploying large language models (LLMs) for time series forecasting by proposing SMETimes, a sub-3B parameter small language model (SLM) that achieves state-of-the-art performance on five benchmark datasets while reducing training time by 3.8x and memory consumption by 5.2x compared to 7B-parameter LLM baselines.

While LLMs have demonstrated remarkable potential in time series forecasting, their practical deployment remains constrained by excessive computational demands and memory footprints. Existing LLM-based approaches typically suffer from three critical limitations: inefficient parameter utilization in handling numerical time series patterns; modality misalignment between continuous temporal signals and discrete text embeddings; and inflexibility for real-time expert knowledge integration. We present SMETimes, the first systematic investigation of sub-3B parameter SLMs for efficient and accurate time series forecasting. Our approach centers on three key innovations: a statistically-enhanced prompting mechanism that bridges numerical time series with textual semantics through descriptive statistical features; an adaptive fusion embedding architecture that aligns temporal patterns with language model token spaces through learnable parameters; and a dynamic mixture-of-experts framework, enabled by SLMs' computational efficiency, that adaptively combines base predictions with domain-specific models. Extensive evaluations across seven benchmark datasets demonstrate that our 3B-parameter SLM achieves state-of-the-art performance on five primary datasets while maintaining 3.8x faster training and 5.2x lower memory consumption compared to 7B-parameter LLM baselines. Notably, the proposed model exhibits better learning capabilities, achieving 12.3% lower MSE than conventional LLMs. Ablation studies validate that our statistical prompting and cross-modal fusion modules contribute 15.7% and 18.2% error reduction, respectively, in long-horizon forecasting tasks. By redefining the efficiency-accuracy trade-off landscape, this work establishes SLMs as viable alternatives to resource-intensive LLMs for practical time series forecasting. Code and models are available at https://github.com/xiyan1234567/SMETimes.
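To make the idea of statistically-enhanced prompting more concrete, the following is a minimal sketch of how descriptive statistics of an input window might be turned into a text prompt. It is not taken from the SMETimes code: the function names `describe_window` and `build_prompt`, the choice of statistics, and the prompt wording are all assumptions made for illustration.

```python
import statistics


def describe_window(values):
    """Summarize a numeric window with basic descriptive statistics.

    The set of statistics here is an illustrative assumption, not the
    feature set used by the paper.
    """
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),   # population standard deviation
        "trend": values[-1] - values[0],    # last value minus first value
    }


def build_prompt(values, horizon):
    """Render the statistics as a short textual prompt (hypothetical wording)."""
    stats = describe_window(values)
    if stats["trend"] > 0:
        direction = "upward"
    elif stats["trend"] < 0:
        direction = "downward"
    else:
        direction = "flat"
    return (
        f"The input window has {len(values)} steps. "
        f"Min {stats['min']:.2f}, max {stats['max']:.2f}, "
        f"mean {stats['mean']:.2f}, std {stats['std']:.2f}, "
        f"overall trend {direction}. "
        f"Forecast the next {horizon} steps."
    )


if __name__ == "__main__":
    print(build_prompt([1.0, 2.0, 3.0, 4.0], horizon=2))
```

In a setup like the one the abstract describes, a prompt of this kind would presumably be embedded by the language model and combined with the numeric series embedding; how that fusion is actually done is specified in the paper and its repository, not here.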
