LG | AI | Mar 5, 2024

InjectTST: A Transformer Method of Injecting Global Information into Independent Channels for Long Time Series Forecasting

arXiv:2403.02814v1 | 2 citations | h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses a key problem in multivariate time series forecasting for applications requiring robustness and accuracy, representing an incremental advancement in model design.

The paper tackles the challenge of balancing channel independence and dependency in multivariate time series forecasting by proposing InjectTST, a Transformer-based method that injects global information into independent channels, achieving stable improvements over state-of-the-art models.

The Transformer has become one of the most popular architectures for multivariate time series (MTS) forecasting. Recent Transformer-based MTS models generally prefer channel-independent structures, based on the observation that channel independence can alleviate noise and distribution drift and thus improve robustness. Nevertheless, channel dependency remains an inherent characteristic of MTS and carries valuable information. Designing a model that combines the merits of channel-independent and channel-mixing structures is key to further improving MTS forecasting, and it is a challenging problem. To address it, this paper proposes InjectTST, a method for injecting global information into a channel-independent Transformer. Instead of designing a channel-mixing model directly, we retain the channel-independent backbone and gradually inject global information into individual channels in a selective way. InjectTST comprises a channel identifier, a global mixing module, and a self-contextual attention module. The channel identifier helps the Transformer distinguish channels for better representation. The global mixing module produces cross-channel global information. Through the self-contextual attention module, the independent channels can selectively concentrate on useful global information without degrading robustness, so channel mixing is achieved implicitly. Experiments indicate that InjectTST achieves stable improvement over state-of-the-art models.
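The three components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no equations, so the function name `inject_global`, the tensor shapes, the linear map over the channel axis used for global mixing, and the single-head cross-attention with a residual connection are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(scores, axis=-1):
    # Numerically stable softmax along the given axis.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=axis, keepdims=True)


def inject_global(tokens, channel_id, w_mix, w_q, w_k, w_v):
    """Hypothetical sketch of injecting global information into channels.

    tokens:     (C, N, D) token embeddings, one sequence per channel
    channel_id: (C, D)    assumed learnable identifier per channel
    w_mix:      (C, C)    assumed mixing weights over the channel axis
    w_q/k/v:    (D, D)    attention projections
    """
    num_channels, num_tokens, dim = tokens.shape

    # 1. Channel identifier: tag every token with its channel's vector.
    identified = tokens + channel_id[:, None, :]

    # 2. Global mixing (assumed form): a linear map across channels,
    #    flattened into one pool of global tokens shared by all channels.
    mixed = np.einsum("gc,cnd->gnd", w_mix, identified)
    global_tokens = mixed.reshape(num_channels * num_tokens, dim)

    # 3. Cross-attention: each channel queries the global pool, so the
    #    attention weights decide which global information it absorbs.
    queries = identified @ w_q                      # (C, N, D)
    keys = global_tokens @ w_k                      # (C*N, D)
    values = global_tokens @ w_v                    # (C*N, D)
    attn = softmax(queries @ keys.T / np.sqrt(dim)) # (C, N, C*N)
    injected = attn @ values                        # (C, N, D)

    # The residual path keeps the channel-independent representation.
    return identified + injected, attn


rng = np.random.default_rng(0)
C, N, D = 3, 4, 8
tokens = rng.normal(size=(C, N, D))
channel_id = rng.normal(size=(C, D))
w_mix = rng.normal(size=(C, C))
w_q, w_k, w_v = (rng.normal(size=(D, D)) for _ in range(3))

out, attn = inject_global(tokens, channel_id, w_mix, w_q, w_k, w_v)
print(out.shape)  # (3, 4, 8)
```

In this sketch the per-channel tokens are only ever used as queries, so the backbone path stays channel-independent, and cross-channel information enters solely through the attention-weighted sum over the global pool. If the value projection is zero, the output reduces to the identified per-channel tokens.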

