CV | Jan 21, 2025

Explainability for Vision Foundation Models: A Survey

arXiv:2501.12203v1 | 8 citations | h-index: 5
Originality: Synthesis-oriented
AI Analysis

The paper addresses the problem of interpreting complex AI models for researchers and practitioners, but its contribution is incremental: it synthesizes existing work rather than introducing novel methods.

This survey tackles the challenge of explainability for vision foundation models by compiling and categorizing research, reviewing evaluation methods, and identifying future directions, without presenting new experimental results.

As artificial intelligence systems become increasingly integrated into daily life, the field of explainability has gained significant attention. This trend is particularly driven by the complexity of modern AI models and their decision-making processes. The advent of foundation models, characterized by their extensive generalization capabilities and emergent uses, has further complicated this landscape. Foundation models occupy an ambiguous position in the explainability domain: their complexity makes them inherently challenging to interpret, yet they are increasingly leveraged as tools to construct explainable models. In this survey, we explore the intersection of foundation models and eXplainable AI (XAI) in the vision domain. We begin by compiling a comprehensive corpus of papers that bridge these fields. Next, we categorize these works based on their architectural characteristics. We then discuss the challenges faced by current research in integrating XAI within foundation models. Furthermore, we review common evaluation methodologies for these combined approaches. Finally, we present key observations and insights from our survey, offering directions for future research in this rapidly evolving field.