LG | Oct 27, 2025

Toward Interpretable Evaluation Measures for Time Series Segmentation

Félix Chavelli, Paul Boniol, Michaël Thomazo

arXiv:2510.23261v1 | 1 citation | h-index: 2

Originality: Incremental advance

AI Analysis

The paper addresses the need for more interpretable evaluation measures in time series segmentation. The advance is incremental because it builds on existing measures such as ARI.

The paper tackles the limited interpretability of current evaluation practice for time series segmentation methods by introducing two new measures, WARI and SMS. On synthetic and real-world benchmarks, the authors report that both measures assess segmentation quality more accurately and expose the provenance and type of errors.

Time series segmentation is a fundamental task in analyzing temporal data across various domains, from human activity recognition to energy monitoring. While numerous state-of-the-art methods have been developed to tackle this problem, the evaluation of their performance remains critically limited. Existing measures predominantly focus on change point accuracy or rely on point-based measures such as the Adjusted Rand Index (ARI), which fail to capture the quality of the detected segments, ignore the nature of errors, and offer limited interpretability. In this paper, we address these shortcomings by introducing two novel evaluation measures: WARI (Weighted Adjusted Rand Index), which accounts for the position of segmentation errors, and SMS (State Matching Score), a fine-grained measure that identifies and scores four fundamental types of segmentation errors while allowing error-specific weighting. We empirically validate WARI and SMS on synthetic and real-world benchmarks, showing that they not only provide a more accurate assessment of segmentation quality but also uncover insights, such as error provenance and type, that are inaccessible with traditional measures.
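
To make the limitation of point-based measures concrete, the following minimal sketch computes the standard ARI on per-time-step state labels. It is not WARI or SMS, whose definitions are not given in this abstract. The helper names `labels_from_segments` and `adjusted_rand_index` and the toy segmentations are hypothetical, chosen only for illustration. Because ARI depends only on the contingency counts between the two labelings, a single mislabeled point yields the same score whether it sits at a segment boundary or in the middle of a segment.

```python
from collections import Counter
from math import comb


def labels_from_segments(segments):
    """Expand (length, state) pairs into one state label per time step."""
    labels = []
    for length, state in segments:
        labels.extend([state] * length)
    return labels


def adjusted_rand_index(truth, pred):
    """Standard ARI computed from the contingency table of two labelings."""
    if len(truth) != len(pred):
        raise ValueError("labelings must have the same length")
    n = len(truth)
    if n < 2:
        return 1.0  # no pairs to compare; treated as perfect agreement

    # Pair counts within contingency cells, true states, and predicted states.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())

    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings have a single state
    return (sum_cells - expected) / (max_index - expected)


# Toy example (illustrative data, not taken from the paper).
truth = labels_from_segments([(5, "A"), (5, "B")])

# One wrong point at the boundary: the change point is detected one step early.
pred_boundary = labels_from_segments([(4, "A"), (6, "B")])

# One wrong point in the middle of the first segment: a spurious short segment.
pred_middle = labels_from_segments([(2, "A"), (1, "B"), (2, "A"), (5, "B")])

ari_boundary = adjusted_rand_index(truth, pred_boundary)
ari_middle = adjusted_rand_index(truth, pred_middle)
print(ari_boundary, ari_middle)
```

Both predictions produce the same contingency table, so ARI cannot distinguish a slightly shifted change point from a spurious segment. This is the kind of positional and error-type information that the abstract says WARI and SMS are designed to capture.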
