LG | Sep 22, 2025

MTM: A Multi-Scale Token Mixing Transformer for Irregular Multivariate Time Series Classification

Shuhan Zhong, Weipeng Zhuo, Sizhe Song, Guanyao Li, Zhongyi Yu, S. -H. Gary Chan

arXiv:2509.17809v17.11 citations | h-index: 10 | KDD

Originality: Incremental advance

AI Analysis

The paper addresses classification of irregular multivariate time series data. It is an incremental advance that reports measurable performance gains over prior methods.

The paper tackles the problem of poor channel-wise modeling in irregular multivariate time series classification by proposing MTM, a multi-scale token mixing transformer, which achieves up to 3.8% improvement in AUPRC on benchmarks.

Irregular multivariate time series (IMTS) is characterized by the lack of synchronized observations across its different channels. In this paper, we point out that this channel-wise asynchrony can lead to poor channel-wise modeling of existing deep learning methods. To overcome this limitation, we propose MTM, a multi-scale token mixing transformer for the classification of IMTS. We find that the channel-wise asynchrony can be alleviated by down-sampling the time series to coarser timescales, and propose to incorporate a masked concat pooling in MTM that gradually down-samples IMTS to enhance the channel-wise attention modules. Meanwhile, we propose a novel channel-wise token mixing mechanism which proactively chooses important tokens from one channel and mixes them with other channels, to further boost the channel-wise learning of our model. Through extensive experiments on real-world datasets and comparison with state-of-the-art methods, we demonstrate that MTM consistently achieves the best performance on all the benchmarks, with improvements of up to 3.8% in AUPRC for classification.
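The claim that down-sampling alleviates channel-wise asynchrony can be illustrated with a small sketch. This is not the paper's masked concat pooling: it substitutes a simple masked mean over non-overlapping windows, and the names `masked_mean_pool` and `co_observed_fraction`, the window size, and the toy data are all assumptions made for illustration. It shows only that two channels which are never observed at the same fine-scale time step can become jointly observed at a coarser timescale.

```python
def masked_mean_pool(values, mask, window):
    """Down-sample a (T x C) series by averaging observed values per window.

    Hypothetical stand-in for a masked pooling step; NOT the paper's
    masked concat pooling.

    values: list of T rows, each a list of C floats
            (entries at unobserved positions are ignored)
    mask:   list of T rows, each a list of C booleans (True = observed)
    Returns (pooled_values, pooled_mask), each with ceil(T / window) rows.
    """
    n_channels = len(values[0])
    pooled_values, pooled_mask = [], []
    for start in range(0, len(values), window):
        rows = range(start, min(start + window, len(values)))
        out_row, out_mask = [], []
        for c in range(n_channels):
            observed = [values[t][c] for t in rows if mask[t][c]]
            # A pooled token counts as observed if any value fell in the window.
            out_mask.append(bool(observed))
            out_row.append(sum(observed) / len(observed) if observed else 0.0)
        pooled_values.append(out_row)
        pooled_mask.append(out_mask)
    return pooled_values, pooled_mask


def co_observed_fraction(mask):
    """Fraction of time steps at which every channel has an observation."""
    return sum(all(row) for row in mask) / len(mask)


# Toy data: two channels sampled at alternating time steps,
# so they are never observed together at the original resolution.
values = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 4.0]]
mask = [[True, False], [False, True], [True, False], [False, True]]

pooled_values, pooled_mask = masked_mean_pool(values, mask, window=2)
print(co_observed_fraction(mask))         # fine scale: 0.0
print(co_observed_fraction(pooled_mask))  # coarse scale: 1.0
```

In this toy case, no fine-scale time step has both channels observed, while every pooled step does, which is the condition under which attention across channels has aligned tokens to compare.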
