LG | Feb 2

Exposing Vulnerabilities in Explanation for Time Series Classifiers via Dual-Target Attacks

arXiv:2602.02763v1
Originality: Highly original
AI Analysis

The paper exposes a critical vulnerability for users who rely on interpretable time series deep learning systems in domains such as healthcare or finance, where trustworthy decisions are essential, though its advance over existing adversarial attack methods is incremental.

The paper challenges the assumption that explanation stability implies robustness in interpretable time series classifiers. It shows that predictions and explanations can be adversarially decoupled, causing targeted misclassification while the explanations remain plausible. TSEF, a dual-target attack, consistently reveals this vulnerability across multiple datasets and explainers, demonstrating that explanation stability is a misleading proxy for robustness.

Interpretable time series deep learning systems are often assessed by checking the temporal consistency of explanations, implicitly treating this as evidence of robustness. We show that this assumption can fail: predictions and explanations can be adversarially decoupled, enabling targeted misclassification while the explanation remains plausible and consistent with a chosen reference rationale. We propose TSEF (Time Series Explanation Fooler), a dual-target attack that jointly manipulates the classifier and explainer outputs. In contrast to single-objective misclassification attacks, which disrupt the explanation and spread attribution mass broadly, TSEF achieves targeted prediction changes while keeping explanations consistent with the reference. Across multiple datasets and explainer backbones, our results consistently reveal that explanation stability is a misleading proxy for decision robustness and motivate coupling-aware robustness evaluations for trustworthy time series tasks.
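
The idea of a dual-target attack can be illustrated with a minimal sketch. This is not the TSEF algorithm, whose objective and optimizer are not given in this summary. It is a generic example under stated assumptions: a hypothetical toy classifier (`logit`), a normalized gradient-saliency explainer (`explanation`), a two-term loss weighted by an assumed parameter `lam`, and finite-difference sign-gradient steps within an assumed perturbation budget `eps`.

```python
import math

# Hypothetical toy classifier (an assumption, not from the paper):
# logit(x) = sum_t W[t] * tanh(x[t]) + B; the predicted class is the sign of the logit.
W = [0.9, -0.4, 0.7, -0.8, 0.5, 0.3, -0.6, 0.2]
B = 0.1

def logit(x):
    return sum(w * math.tanh(v) for w, v in zip(W, x)) + B

def explanation(x):
    """Absolute gradient saliency per time step, normalized to sum to 1."""
    sal = [abs(w * (1.0 - math.tanh(v) ** 2)) for w, v in zip(W, x)]
    total = sum(sal)
    return [s / total for s in sal]

def shift(x, delta):
    return [xi + di for xi, di in zip(x, delta)]

def dual_loss(x_adv, target, ref_expl, lam):
    """Prediction term (pushes the logit toward `target`, which is +1 or -1)
    plus lam times the squared distance between the explanation and the reference."""
    pred_term = math.log1p(math.exp(-target * logit(x_adv)))
    expl_term = sum((e - r) ** 2 for e, r in zip(explanation(x_adv), ref_expl))
    return pred_term + lam * expl_term

def dual_target_attack(x, target, ref_expl, eps=0.5, lam=10.0,
                       steps=200, lr=0.05, h=1e-4):
    """Search for a perturbation with |delta[t]| <= eps that lowers the dual loss.
    Returns the best perturbation seen and its loss."""
    delta = [0.0] * len(x)
    best_delta = list(delta)
    best_loss = dual_loss(x, target, ref_expl, lam)
    for _ in range(steps):
        # Central finite-difference estimate of the loss gradient w.r.t. delta.
        grad = []
        for t in range(len(x)):
            plus, minus = list(delta), list(delta)
            plus[t] += h
            minus[t] -= h
            loss_plus = dual_loss(shift(x, plus), target, ref_expl, lam)
            loss_minus = dual_loss(shift(x, minus), target, ref_expl, lam)
            grad.append((loss_plus - loss_minus) / (2.0 * h))
        # Signed step, then projection back onto the eps-ball (L-infinity).
        for t, g in enumerate(grad):
            step = lr if g > 0 else (-lr if g < 0 else 0.0)
            delta[t] = max(-eps, min(eps, delta[t] - step))
        loss = dual_loss(shift(x, delta), target, ref_expl, lam)
        if loss < best_loss:
            best_loss, best_delta = loss, list(delta)
    return best_delta, best_loss

# Usage: aim for the opposite class while keeping the clean explanation as reference.
x_clean = [0.2, -0.1, 0.4, 0.3, -0.5, 0.1, 0.0, -0.2]
target_class = -1 if logit(x_clean) > 0 else 1
reference = explanation(x_clean)
delta, final_loss = dual_target_attack(x_clean, target_class, reference)
print("clean logit:", logit(x_clean), "adversarial logit:", logit(shift(x_clean, delta)))
```

Setting `lam` to zero reduces the sketch to a single-objective misclassification attack, which is the baseline the abstract contrasts against. Larger values trade attack strength for explanation fidelity to the reference.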

