CV, AI | Oct 14, 2025

Hybrid Explanation-Guided Learning for Transformer-Based Chest X-Ray Diagnosis

Shelley Zixin Shu, Haozhe Luo, Alexander Poellinger, Mauricio Reyes

arXiv:2510.12704v1 | 3.6 | h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses bias in medical imaging AI used by clinicians, but the advance is incremental: it builds on existing explanation-guided learning methods.

The paper tackles spurious correlations and limited generalization in transformer-based chest X-ray diagnosis by proposing a Hybrid Explanation-Guided Learning (H-EGL) framework. The authors report that it outperforms two state-of-the-art Explanation-Guided Learning (EGL) methods in classification accuracy and generalization, and that its attention maps align better with human expertise.

Transformer-based deep learning models have demonstrated exceptional performance in medical imaging by leveraging attention mechanisms for feature representation and interpretability. However, these models are prone to learning spurious correlations, leading to biases and limited generalization. While human-AI attention alignment can mitigate these issues, it often depends on costly manual supervision. In this work, we propose a Hybrid Explanation-Guided Learning (H-EGL) framework that combines self-supervised and human-guided constraints to enhance attention alignment and improve generalization. The self-supervised component of H-EGL leverages class-distinctive attention without relying on restrictive priors, promoting robustness and flexibility. We validate our approach on chest X-ray classification using the Vision Transformer (ViT), where H-EGL outperforms two state-of-the-art Explanation-Guided Learning (EGL) methods, demonstrating superior classification accuracy and generalization capability. Additionally, it produces attention maps that are better aligned with human expertise.
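
The abstract does not give the loss formulation, so the following is only a minimal sketch of how a hybrid objective of this kind could be assembled: a classification loss plus a human-guided attention term (applied only where annotations exist) and a self-supervised term that rewards class-distinctive attention. Every function name, the mean-squared-error alignment term, the cosine-similarity penalty, and the weights `lam_human` and `lam_self` are assumptions made for illustration, not the authors' method.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy. logits: (batch, classes); labels: (batch,) ints."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def human_guided_loss(attn, masks, has_mask):
    """Assumed alignment term: MSE between attention and expert masks.

    attn, masks: (batch, patches); has_mask: (batch,) bool marking annotated samples.
    Samples without a human annotation contribute nothing.
    """
    if not has_mask.any():
        return 0.0
    diff = attn[has_mask] - masks[has_mask]
    return float((diff ** 2).mean())

def class_distinctive_loss(class_attn, eps=1e-8):
    """Assumed self-supervised term: mean cosine similarity between the
    attention maps of different classes (lower means more distinctive).

    class_attn: (batch, classes, patches).
    """
    norms = np.linalg.norm(class_attn, axis=-1, keepdims=True)
    normed = class_attn / (norms + eps)
    sim = normed @ normed.transpose(0, 2, 1)               # (batch, classes, classes)
    off_diag = ~np.eye(class_attn.shape[1], dtype=bool)    # ignore self-similarity
    return float(sim[:, off_diag].mean())

def hybrid_loss(logits, labels, attn, masks, has_mask, class_attn,
                lam_human=1.0, lam_self=1.0):
    """Weighted sum of the three terms; the weights are hypothetical hyperparameters."""
    return (cross_entropy(logits, labels)
            + lam_human * human_guided_loss(attn, masks, has_mask)
            + lam_self * class_distinctive_loss(class_attn))
```

In this sketch the human-guided term is masked per sample, which reflects the abstract's point that manual supervision is costly and therefore likely available for only part of the data, while the self-supervised term applies to every sample.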
