LG | AI | ML | Feb 24, 2025

Class-Dependent Perturbation Effects in Evaluating Time Series Attributions

arXiv:2502.17022v2 | 5 citations | h-index: 24 | xAI
Originality: Incremental advance
AI Analysis

This work addresses a previously overlooked issue in evaluating explainable AI methods for time series. The advance is incremental because it builds on existing perturbation-based metrics.

The study identifies class-dependent effects in perturbation-based evaluation metrics for feature attributions in time series classification: the metrics vary in effectiveness across classes, and the most effective perturbation strategies often show the most pronounced class differences. It proposes a class-aware evaluation framework to account for these effects, which is particularly useful for class-imbalanced datasets.

As machine learning models become increasingly prevalent in time series applications, Explainable Artificial Intelligence (XAI) methods are essential for understanding their predictions. Within XAI, feature attribution methods aim to identify which input features contribute the most to a model's prediction, with their evaluation typically relying on perturbation-based metrics. Through systematic empirical analysis across multiple datasets, model architectures, and perturbation strategies, we reveal previously overlooked class-dependent effects in these metrics: they show varying effectiveness across classes, achieving strong results for some while remaining less sensitive to others. In particular, we find that the most effective perturbation strategies often demonstrate the most pronounced class differences. Our analysis suggests that these effects arise from the learned biases of classifiers, indicating that perturbation-based evaluation may reflect specific model behaviors rather than intrinsic attribution quality. We propose an evaluation framework with a class-aware penalty term to help assess and account for these effects in evaluating feature attributions, offering particular value for class-imbalanced datasets. Although our analysis focuses on time series classification, these class-dependent effects likely extend to other structured data domains where perturbation-based evaluation is common.
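To make the class-dependent effect concrete, here is a minimal sketch of a perturbation-based evaluation broken down by class. It is not the paper's implementation: the function names (`perturb_top_k`, `per_class_degradation`, `class_aware_score`), the toy classifier, and especially the form of the penalty (mean degradation minus a weighted max-min spread across classes) are assumptions made for illustration, since the abstract does not define the penalty term.

```python
import numpy as np

def perturb_top_k(x, attribution, k, fill_value=0.0):
    """Replace the k time steps with the highest attribution by fill_value."""
    x_pert = x.copy()
    top_idx = np.argsort(attribution)[-k:]
    x_pert[top_idx] = fill_value
    return x_pert

def per_class_degradation(predict_proba, X, y, attributions, k, fill_value=0.0):
    """Mean drop in the true-class probability after perturbation, per class."""
    X_pert = np.stack(
        [perturb_top_k(x, a, k, fill_value) for x, a in zip(X, attributions)]
    )
    idx = np.arange(len(y))
    drop = predict_proba(X)[idx, y] - predict_proba(X_pert)[idx, y]
    return {int(c): float(drop[y == c].mean()) for c in np.unique(y)}

def class_aware_score(per_class, penalty_weight=1.0):
    """Hypothetical penalty: mean degradation minus weighted spread across classes."""
    values = np.array(list(per_class.values()))
    spread = values.max() - values.min()
    return float(values.mean() - penalty_weight * spread)

def toy_predict_proba(X):
    """Toy classifier (assumption): P(class 1) is the clipped mean of the series."""
    p1 = np.clip(X.mean(axis=1), 0.0, 1.0)
    return np.column_stack([1.0 - p1, p1])

# Class 0 series are all zeros, class 1 series are all ones.
X = np.array([[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]], dtype=float)
y = np.array([0, 1, 0, 1])
# Placeholder attributions that rank the last time steps highest.
attributions = np.tile(np.arange(4.0), (4, 1))

per_class = per_class_degradation(toy_predict_proba, X, y, attributions, k=2)
score = class_aware_score(per_class)
print(per_class)  # {0: 0.0, 1: 0.5}
print(score)      # -0.25
```

In this toy setup, zero-filling moves class 1 samples toward the class 0 region but leaves class 0 samples unchanged, so the same attributions look informative for one class and uninformative for the other. An aggregate score would hide that gap, which is the kind of behavior a class-aware breakdown is meant to surface.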

Code Implementations: 1 repo
