LG AI | Jun 26, 2025

mTSBench: Benchmarking Multivariate Time Series Anomaly Detection and Model Selection at Scale

Xiaona Zhou, Constantin Brif, Ismini Lourentzou

arXiv:2506.21550v113.06 citations | h-index: 18 | Has Code

Originality: Synthesis-oriented

AI Analysis

The paper addresses benchmarking and model selection for anomaly detection in domains like healthcare and cybersecurity. Its contribution is incremental: it confirms prior findings with a larger-scale evaluation.

The paper tackles multivariate time series anomaly detection by introducing mTSBench, a large benchmark of 344 labeled time series across 19 datasets. It evaluates 24 methods and finds that no single detector excels across datasets and that model selection methods remain far from optimal.

Multivariate time series anomaly detection (MTS-AD) is critical in domains like healthcare, cybersecurity, and industrial monitoring, yet remains challenging due to complex inter-variable dependencies, temporal dynamics, and sparse anomaly labels. We introduce mTSBench, the largest benchmark to date for MTS-AD and unsupervised model selection, spanning 344 labeled time series across 19 datasets and 12 diverse application domains. mTSBench evaluates 24 anomaly detection methods, including large language model (LLM)-based detectors for multivariate time series, and systematically benchmarks unsupervised model selection techniques under standardized conditions. Consistent with prior findings, our results confirm that no single detector excels across datasets, underscoring the importance of model selection. However, even state-of-the-art selection methods remain far from optimal, revealing critical gaps. mTSBench provides a unified evaluation suite to enable rigorous, reproducible comparisons and catalyze future advances in adaptive anomaly detection and robust model selection.
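The two headline claims (no single detector wins everywhere, and selectors fall short of optimal) can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the score table, the detector names (`det_1` to `det_3`), the series names, and the selector's `choices` are all invented toy values, not results or APIs from mTSBench. It compares three quantities: the oracle (best detector per series), the best single detector applied everywhere, and a hypothetical selector.

```python
# Toy scores only: outer keys are time series, inner keys are detectors.
# These numbers are invented for illustration and are NOT results from mTSBench.
scores = {
    "series_a": {"det_1": 0.80, "det_2": 0.40, "det_3": 0.55},
    "series_b": {"det_1": 0.30, "det_2": 0.75, "det_3": 0.60},
    "series_c": {"det_1": 0.50, "det_2": 0.45, "det_3": 0.90},
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def oracle_score(scores):
    """Average score when the best detector is picked per series (upper bound)."""
    return mean(max(per_det.values()) for per_det in scores.values())


def best_single_detector(scores):
    """Detector with the highest average score when used on every series."""
    detectors = next(iter(scores.values())).keys()
    averages = {d: mean(s[d] for s in scores.values()) for d in detectors}
    best = max(averages, key=averages.get)
    return best, averages[best]


def selector_score(scores, choices):
    """Average score achieved by a selector's per-series choices."""
    return mean(scores[name][choices[name]] for name in scores)


# Hypothetical output of an unsupervised selector: one detector per series.
choices = {"series_a": "det_1", "series_b": "det_3", "series_c": "det_2"}

oracle = oracle_score(scores)
name, single = best_single_detector(scores)
selected = selector_score(scores, choices)
gap = oracle - selected  # how far the selector is from the per-series optimum

print(f"oracle={oracle:.3f} best_single={name} ({single:.3f}) "
      f"selector={selected:.3f} gap={gap:.3f}")
```

In this toy table each series has a different best detector, so the oracle beats any fixed choice, and the hypothetical selector lands below both. The gap to the oracle is the kind of shortfall the abstract describes, though the benchmark's actual metrics and protocol may differ from this sketch.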
