LG · AI · PF · Feb 23, 2025

Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification

arXiv:2502.16627v4 · 12 citations · h-index: 4 · Int J Comput Appl
Originality: Synthesis-oriented
AI Analysis

This work addresses energy efficiency for transformer deployment in resource-constrained environments, but its contribution is incremental: it applies existing optimization methods to a specific domain.

The study addresses the computational demands of transformer models in time series classification by investigating optimization techniques such as pruning and quantization. It finds that static quantization reduced energy consumption by 29.14% and that L1 pruning improved inference speed by 63% with minimal accuracy loss.

The increasing computational demands of transformer models in time series classification necessitate effective optimization strategies for energy-efficient deployment. Our study presents a systematic investigation of optimization techniques, focusing on structured pruning and quantization methods for transformer architectures. Through extensive experimentation on three distinct datasets (RefrigerationDevices, ElectricDevices, and PLAID), we quantitatively evaluate model performance and energy efficiency across different transformer configurations. Our experimental results demonstrate that static quantization reduces energy consumption by 29.14% while maintaining classification performance, and L1 pruning achieves a 63% improvement in inference speed with minimal accuracy degradation. Our findings provide valuable insights into the effectiveness of optimization strategies for transformer-based time series classification, establishing a foundation for efficient model deployment in resource-constrained environments.
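The two techniques named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names (`l1_structured_prune`, `calibrate_scale`, `quantize_int8`), the row-wise pruning granularity, and the symmetric int8 scheme are all assumptions chosen for illustration. The sketch shows the general idea only: L1 structured pruning removes whole rows of a weight matrix with the smallest L1 norm, and static quantization maps floats to 8-bit integers using a scale fixed ahead of time from calibration data.

```python
from typing import List

Matrix = List[List[float]]


def l1_structured_prune(weights: Matrix, fraction: float) -> Matrix:
    """Zero out the `fraction` of rows with the smallest L1 norm (hypothetical helper)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    n_prune = int(len(weights) * fraction)
    # Rank row indices by L1 norm (sum of absolute values), smallest first.
    order = sorted(range(len(weights)),
                   key=lambda i: sum(abs(w) for w in weights[i]))
    pruned = set(order[:n_prune])
    # Pruned rows become all zeros; other rows are copied unchanged.
    return [[0.0] * len(row) if i in pruned else list(row)
            for i, row in enumerate(weights)]


def calibrate_scale(calibration_values: List[float]) -> float:
    """Derive a fixed symmetric int8 scale from calibration data (assumed scheme)."""
    max_abs = max(abs(v) for v in calibration_values)
    return max_abs / 127 if max_abs > 0 else 1.0


def quantize_int8(values: List[float], scale: float) -> List[int]:
    """Quantize with a precomputed scale, clamping to the range [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate float values."""
    return [q * scale for q in quantized]
```

In a real deployment these steps would normally be handled by a framework's pruning and quantization tooling; the sketch only makes the arithmetic behind the two terms explicit.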
