LG AI CV · May 9, 2025

Wasserstein Distances Made Explainable: Insights into Dataset Shifts and Transport Phenomena

Philip Naumann, Jacob Kauffmann, Grégoire Montavon

arXiv:2505.06123v1 · 9.42 citations · h-index: 40 · IEEE Trans Pattern Anal Mach Intell

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretability in Wasserstein distance analysis for researchers and practitioners dealing with dataset shifts and transport phenomena, though it appears incremental as it builds on existing frameworks.

The paper tackles the problem of understanding which factors contribute to a high or low Wasserstein distance when comparing data distributions. It proposes an Explainable AI method that attributes these distances to data components such as subgroups or features, and reports high accuracy across diverse datasets.

Wasserstein distances provide a powerful framework for comparing data distributions. They can be used to analyze processes over time or to detect inhomogeneities within data. However, simply calculating the Wasserstein distance or analyzing the corresponding transport map (or coupling) may not be sufficient for understanding what factors contribute to a high or low Wasserstein distance. In this work, we propose a novel solution based on Explainable AI that allows us to efficiently and accurately attribute Wasserstein distances to various data components, including data subgroups, input features, or interpretable subspaces. Our method achieves high accuracy across diverse datasets and Wasserstein distance specifications, and its practical utility is demonstrated in two use cases.
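To make the idea of attributing a Wasserstein distance to input features concrete, here is a minimal toy sketch. It is not the attribution method proposed in the paper. It assumes two equal-size point sets with uniform weights and a squared Euclidean cost, finds the optimal matching by brute force, and splits the resulting squared 2-Wasserstein distance into per-feature terms (the split is exact only because the squared Euclidean cost is a sum over features). The function name `squared_w2_with_feature_shares` and the data are hypothetical.

```python
from itertools import permutations


def squared_w2_with_feature_shares(xs, ys):
    """Squared 2-Wasserstein distance between two equal-size, uniformly
    weighted point sets, plus a per-feature split of that value.

    Brute force over all matchings, so only suitable for tiny toy inputs.
    """
    n, dim = len(xs), len(xs[0])
    best_cost, best_perm = float("inf"), None
    # For uniform weights and equal sizes, an optimal coupling is a permutation.
    for perm in permutations(range(n)):
        cost = sum(
            sum((xs[i][d] - ys[j][d]) ** 2 for d in range(dim))
            for i, j in enumerate(perm)
        ) / n
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    # Squared Euclidean cost is additive over features, so the total
    # transport cost splits into one term per feature.
    per_feature = [
        sum((xs[i][d] - ys[j][d]) ** 2 for i, j in enumerate(best_perm)) / n
        for d in range(dim)
    ]
    return best_cost, per_feature


# Toy data (hypothetical): the target is the source shifted along feature 0 only.
source = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
target = [(3.0, 0.0), (4.0, 0.0), (3.0, 1.0), (4.0, 1.0)]

w2_squared, shares = squared_w2_with_feature_shares(source, target)
print(w2_squared, shares)
```

In this toy case the whole distance is assigned to feature 0, the only feature along which the data moved. A coupling-based split like this is a naive baseline: as the abstract notes, inspecting the transport map alone may not be sufficient in general.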
