LG · AI · Dec 17, 2024

A Comparative Study of Pruning Methods in Transformer-based Time Series Forecasting

arXiv:2412.12883v1 · 1 citation · h-index: 7
Originality: Synthesis-oriented
AI Analysis

This work addresses the computational efficiency of deploying time series forecasting models in resource-constrained settings such as embedded devices, but it is incremental: it benchmarks existing pruning techniques.

The study evaluated pruning methods for Transformer-based time series forecasting models to address their high computational demands, finding that some models can be pruned to high sparsity levels while maintaining or improving performance, though structured pruning did not yield significant time savings.

The current landscape in time-series forecasting is dominated by Transformer-based models. Their high parameter count and corresponding demand in computational resources pose a challenge to real-world deployment, especially for commercial and scientific applications with low-power embedded devices. Pruning is an established approach to reduce neural network parameter count and save compute. However, the implications and benefits of pruning Transformer-based models for time series forecasting are largely unknown. To close this gap, we provide a comparative benchmark study by evaluating unstructured and structured pruning on various state-of-the-art multivariate time series models. We study the effects of these pruning strategies on model predictive performance and computational aspects like model size, operations, and inference time. Our results show that certain models can be pruned even up to high sparsity levels, outperforming their dense counterpart. However, fine-tuning pruned models is necessary. Furthermore, we demonstrate that even with corresponding hardware and software support, structured pruning is unable to provide significant time savings.
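The two pruning families named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the toy weight matrix `W`, and the magnitude-based selection criterion are all assumptions made for illustration, since the abstract does not specify the criteria used.

```python
def unstructured_prune(weights, sparsity):
    """Zero the smallest-magnitude individual weights; the matrix shape is unchanged.

    Hypothetical helper: the magnitude criterion is an assumption.
    """
    cells = [(abs(w), i, j) for i, row in enumerate(weights) for j, w in enumerate(row)]
    k = int(sparsity * len(cells))  # number of individual weights to zero
    drop = {(i, j) for _, i, j in sorted(cells)[:k]}
    return [[0.0 if (i, j) in drop else w for j, w in enumerate(row)]
            for i, row in enumerate(weights)]


def structured_prune(weights, sparsity):
    """Remove whole rows (e.g. output neurons) with the smallest L1 norm.

    Hypothetical helper: row-wise L1 ranking is an assumption.
    """
    norms = sorted((sum(abs(w) for w in row), i) for i, row in enumerate(weights))
    k = int(sparsity * len(weights))  # number of rows to remove
    drop = {i for _, i in norms[:k]}
    return [row[:] for i, row in enumerate(weights) if i not in drop]


# Toy 4x3 weight matrix (made-up values).
W = [
    [0.9, -0.1, 0.4],
    [0.05, 0.02, -0.03],
    [-0.7, 0.6, 0.2],
    [0.3, -0.8, 0.01],
]

sparse_W = unstructured_prune(W, 0.5)  # still 4x3, half the entries are zero
small_W = structured_prune(W, 0.5)     # 2x3, two rows removed entirely
```

Unstructured pruning leaves the matrix dimensions intact, so any speed-up depends on sparse-kernel support in the hardware and software. Structured pruning produces a smaller dense matrix, which in principle needs no such support, although the abstract reports that it did not yield significant time savings in the authors' measurements.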

