LGMar 31

One-for-All: A Lightweight Stabilized and Parameter-Efficient Pre-trained LLM for Time Series Forecasting

Prasanjit Dey, Soumyabrata Dev, Bianca Schoen-Phelan

arXiv:2603.2975618.7

AI Analysis

This work addresses the computational and memory inefficiencies of deploying LLMs for time-series analysis, enabling edge deployment in domains like healthcare and finance, though it is incremental as it builds on existing PEFT methods.

The paper tackles the challenge of adapting pre-trained LLMs for multivariate time-series forecasting by introducing One-for-All, a lightweight framework that uses Gaussian Rank-Stabilized Low-Rank Adapters (rsLoRA) to reduce trainable parameters by up to 21× and memory footprint by 168-1,776× while matching state-of-the-art forecasting accuracy (MSE=0.33).

We address the challenge of adapting pre-trained Large Language Models (LLMs) for multivariate time-series analysis, where their deployment is often hindered by prohibitive computational and memory demands. Our solution, One-for-All, introduces Gaussian Rank-Stabilized Low-Rank Adapters (rsLoRA) to enable parameter-efficient fine-tuning of frozen LLMs. While inspired by LoRA, rsLoRA introduces a mathematically grounded rank-stabilization mechanism that enables provable gradient stability at low ranks a novel contribution absent in prior PEFT methods. Our framework injects trainable rank decomposition matrices (rank 16) into positional embeddings and output layers, while keeping self-attention weights fixed. This design reduces trainable parameters by 6.8$\times$ (vs. TimesNet), 21$\times$ (vs. GPT4TS), and 11.8$\times$ (vs. TIME-LLM), while achieving a 168-1,776$\times$ smaller memory footprint (2.2MiB vs. 340MiB-4.18GiB in SOTA models). Rigorous evaluation across six time-series tasks demonstrates that One-for-All achieves state-of-the-art efficiency-accuracy trade-offs: 5.5$\times$ higher parameter efficiency (MSE=5.50) than TimesNet and 21$\times$ better than GPT4TS, while matching their forecasting accuracy (MSE=0.33). The framework's stability is validated through consistent performance across diverse horizons (96-720 steps) and datasets (ETT, Weather, M3, M4), with 98.3% fewer parameters than conventional transformers. These advances enable deployment on edge devices for healthcare, finance, and environmental monitoring without compromising performance.

View on arXiv PDF

Similar