LG · Feb 23, 2022

A Differential Attention Fusion Model Based on Transformer for Time Series Forecasting

arXiv:2202.11402v1 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in time series forecasting for domains like equipment life cycle and traffic flow, though it appears incremental as it builds on existing Transformer architectures.

The paper tackles the insensitivity of Transformer-based time series forecasting to small, decisive time segments. It proposes a differential attention fusion model with a differential layer, neighbor attention, and a sliding fusion mechanism, and reports results that compare favorably with state-of-the-art methods on three datasets.

Time series forecasting is widely used in equipment life cycle forecasting, weather forecasting, traffic flow forecasting, and other fields. Recently, some scholars have tried to apply the Transformer to time series forecasting because of its powerful parallel training ability. However, existing Transformer methods do not pay enough attention to the small time segments that play a decisive role in prediction. This makes them insensitive to small changes that affect the trend of a time series and makes it difficult for them to learn continuous time-dependent features effectively. To solve this problem, we propose a differential attention fusion model based on the Transformer, which adds a differential layer, neighbor attention, a sliding fusion mechanism, and a residual layer to the classical Transformer architecture. Specifically, the differences between adjacent time points are extracted by the differential layer and focused on by neighbor attention. The sliding fusion mechanism fuses the various features of each time point so that the data can participate in encoding and decoding without losing important information. The residual layer, which includes convolution and LSTM, further learns the dependence between time points and enables our model to be trained more deeply. Extensive experiments on three datasets show that the prediction results produced by our method compare favorably with the state of the art.
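
To make the idea of differencing adjacent time points and attending to nearby steps concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the abstract does not specify the layers in detail, so the function names (`adjacent_differences`, `neighbor_mask`, `neighbor_attention`), the zero padding of the first step, and the `radius` window parameter are all assumptions chosen for illustration. The attention shown is ordinary scaled dot-product attention restricted to a local band, which is one plausible reading of "neighbor attention".

```python
import numpy as np

def adjacent_differences(x):
    """First-order differences along the time axis (axis 0).

    Assumption: the first step is padded with zeros so the output
    keeps the same length as the input.
    """
    x = np.asarray(x, dtype=float)
    diff = np.diff(x, axis=0)
    pad = np.zeros_like(x[:1])
    return np.concatenate([pad, diff], axis=0)

def neighbor_mask(length, radius):
    """Boolean mask: step i may attend to step j only if |i - j| <= radius."""
    idx = np.arange(length)
    return np.abs(idx[:, None] - idx[None, :]) <= radius

def neighbor_attention(q, k, v, radius):
    """Scaled dot-product attention limited to a local window (hypothetical)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(neighbor_mask(len(q), radius), scores, -np.inf)
    # Numerically stable softmax over each row; masked entries become 0.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy series with one sharp jump between steps 2 and 3.
series = np.array([[1.0], [1.0], [1.5], [4.0], [4.0]])
d = adjacent_differences(series)
out, w = neighbor_attention(d, d, d, radius=1)
```

In this toy series the jump appears as the largest entry of `d`, and each row of `w` places weight only on a step and its immediate neighbors. A real model would use learned projections for the queries, keys, and values rather than the raw differences.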

