Hugues Turbé

CV
h-index: 21
Papers: 4
Citations: 147
Novelty: 41%
AI Score: 33

4 Papers

LG, Jul 29, 2024
Revisiting the robustness of post-hoc interpretability methods

Jiawen Wei, Hugues Turbé, Gianmarco Mengaldo

Post-hoc interpretability methods play a critical role in explainable artificial intelligence (XAI), as they pinpoint portions of data that a trained deep learning model deemed important to make a decision. However, different post-hoc interpretability methods often provide different results, casting doubts on their accuracy. For this reason, several evaluation strategies have been proposed to understand the accuracy of post-hoc interpretability. Many of these evaluation strategies provide a coarse-grained assessment -- i.e., they evaluate how the performance of the model degrades on average by corrupting different data points across multiple samples. While these strategies are effective in selecting the post-hoc interpretability method that is most reliable on average, they fail to provide a sample-level, also referred to as fine-grained, assessment. In other words, they do not measure the robustness of post-hoc interpretability methods. We propose an approach and two new metrics to provide a fine-grained assessment of post-hoc interpretability methods. We show that the robustness of a method is generally linked to its coarse-grained performance.
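To make the coarse-grained versus sample-level distinction concrete, here is a minimal sketch in plain Python. It is not the paper's approach or its two metrics: the linear toy model, the weight-times-input relevance, the zero-masking corruption, and every name (`WEIGHTS`, `model_score`, `attribution`, `corrupt_top_k`, `score_drops`) are assumptions made purely for illustration.

```python
from statistics import mean

# Hypothetical toy model: a fixed linear scorer (an assumption, not from the paper).
WEIGHTS = [2.0, 1.0, 0.5, 0.0]


def model_score(x):
    """Score of the toy model for one sample."""
    return sum(w * v for w, v in zip(WEIGHTS, x))


def attribution(x):
    """Weight-times-input relevance, standing in for any post-hoc method."""
    return [w * v for w, v in zip(WEIGHTS, x)]


def corrupt_top_k(x, relevance, k):
    """Zero out the k features with the largest absolute relevance."""
    order = sorted(range(len(x)), key=lambda i: abs(relevance[i]), reverse=True)
    top = set(order[:k])
    return [0.0 if i in top else v for i, v in enumerate(x)]


def score_drops(samples, k):
    """Per-sample drop in model score after corrupting the top-k features."""
    drops = []
    for x in samples:
        corrupted = corrupt_top_k(x, attribution(x), k)
        drops.append(model_score(x) - model_score(corrupted))
    return drops


samples = [
    [1.0, 0.0, 2.0, 3.0],
    [0.5, 0.5, 1.0, 1.0],
    [0.0, 0.0, 4.0, 1.0],
]

drops = score_drops(samples, k=1)  # sample-level (fine-grained) view
coarse = mean(drops)               # averaged (coarse-grained) view
print("per-sample drops:", drops)
print("average drop:", coarse)
```

The single averaged number summarises the method across samples, while the per-sample list exposes the spread that the average hides, which is the kind of information a fine-grained assessment is after.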

CV, Feb 26, 2025
Tell me why: Visual foundation models as self-explainable classifiers

Hugues Turbé, Mina Bjelogrlic, Gianmarco Mengaldo et al.

Visual foundation models (VFMs) have become increasingly popular due to their state-of-the-art performance. However, interpretability remains crucial for critical applications. In this sense, self-explainable models (SEM) aim to provide interpretable classifiers that decompose predictions into a weighted sum of interpretable concepts. Despite their promise, recent studies have shown that these explanations often lack faithfulness. In this work, we combine VFMs with a novel prototypical architecture and specialized training objectives. By training only a lightweight head (approximately 1M parameters) on top of frozen VFMs, our approach (ProtoFM) offers an efficient and interpretable solution. Evaluations demonstrate that our approach achieves competitive classification performance while outperforming existing models across a range of interpretability metrics derived from the literature. Code is available at https://github.com/hturbe/proto-fm.
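The idea of decomposing a prediction into a weighted sum of concepts can be sketched in a few lines of plain Python. This is only an illustration of the general decomposition, not the ProtoFM architecture or its training objectives; the concept scores, the weight matrix, and the helper names `class_logits` and `explain` are hypothetical.

```python
def class_logits(concept_scores, weights):
    """Each class logit is a weighted sum of the concept scores."""
    return [sum(w * s for w, s in zip(row, concept_scores)) for row in weights]


def explain(concept_scores, weights, class_idx):
    """Per-concept contributions to one class logit."""
    return [w * s for w, s in zip(weights[class_idx], concept_scores)]


# Hypothetical values: 3 concepts, 2 classes.
concept_scores = [0.5, 0.0, 1.0]   # how strongly each concept is present
weights = [
    [2.0, 0.0, 1.0],               # class 0
    [0.0, 3.0, 0.5],               # class 1
]

logits = class_logits(concept_scores, weights)
predicted = max(range(len(logits)), key=logits.__getitem__)
contributions = explain(concept_scores, weights, predicted)

print("logits:", logits)
print("predicted class:", predicted)
print("contributions:", contributions)
```

Because the contributions sum exactly to the predicted class logit, the explanation is a complete account of the linear head in this toy setting; whether the concepts themselves are faithful is the separate question the abstract raises.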

CV, Jun 14, 2024
ProtoS-ViT: Visual foundation models for sparse self-explainable classifications

Hugues Turbé, Mina Bjelogrlic, Gianmarco Mengaldo et al.

Prototypical networks aim to build intrinsically explainable models based on the linear summation of concepts. Concepts are coherent entities that we, as humans, can recognize and associate with a certain object or entity. However, important challenges remain in the fair evaluation of explanation quality provided by these models. This work first proposes an extensive set of quantitative and qualitative metrics that make it possible to identify drawbacks in current prototypical networks. It then introduces a novel architecture which provides compact explanations, outperforming current prototypical models in terms of explanation quality. Overall, the proposed architecture demonstrates how frozen pre-trained ViT backbones can be effectively turned into prototypical models for both general and domain-specific tasks, in our case biomedical image classifiers. Code is available at https://github.com/hturbe/protosvit.

LG, Feb 11, 2022
Evaluation of post-hoc interpretability methods in time-series classification

Hugues Turbé, Mina Bjelogrlic, Christian Lovis et al.

Post-hoc interpretability methods are critical tools to explain neural-network results. Several post-hoc methods have emerged in recent years, but when applied to a given task, they produce different results, raising the question of which method is the most suitable to provide correct post-hoc interpretability. To understand the performance of each method, quantitative evaluation of interpretability methods is essential. However, currently available frameworks have several drawbacks which hinder the adoption of post-hoc interpretability methods, especially in high-risk sectors. In this work, we propose a framework with quantitative metrics to assess the performance of existing post-hoc interpretability methods, in particular in time-series classification. We show that several drawbacks identified in the literature are addressed, namely dependence on human judgement, retraining, and shift in the data distribution when occluding samples. We additionally design a synthetic dataset with known discriminative features and tunable complexity. The proposed methodology and quantitative metrics can be used to understand the reliability of the results that interpretability methods produce in practical applications. In turn, they can be embedded within operational workflows in critical fields that require accurate interpretability results for, e.g., regulatory policies.