LG · CV · Feb 8, 2023

Shortcut Detection with Variational Autoencoders

arXiv:2302.04246v2 · 4 citations · h-index: 8
AI Analysis

This paper addresses the challenge of ensuring that models generalize well in real-world ML applications, though it appears incremental: it applies existing VAE methods to a known problem.

The paper tackles the problem of detecting spurious correlations (shortcuts) in image and audio datasets using variational autoencoders, identifying previously undiscovered shortcuts in real-world datasets.

For real-world applications of machine learning (ML), it is essential that models make predictions based on well-generalizing features rather than spurious correlations in the data. The identification of such spurious correlations, also known as shortcuts, is a challenging problem and has so far been scarcely addressed. In this work, we present a novel approach to detect shortcuts in image and audio datasets by leveraging variational autoencoders (VAEs). The disentanglement of features in the latent space of VAEs allows us to discover feature-target correlations in datasets and semi-automatically evaluate them for ML shortcuts. We demonstrate the applicability of our method on several real-world datasets and identify shortcuts that have not been discovered before.
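The core idea in the abstract, scanning latent dimensions for correlations with the target, can be illustrated with a minimal sketch. This is not the authors' implementation: the latent codes below are synthetic stand-ins for VAE encoder outputs, the function names (`pearson`, `rank_latent_dims`, `make_synthetic_latents`) are hypothetical, and plain Pearson correlation is an assumed choice of statistic.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_latent_dims(latents, labels):
    """Return (dim, |corr|) pairs, strongest label correlation first."""
    n_dims = len(latents[0])
    scores = []
    for d in range(n_dims):
        column = [z[d] for z in latents]
        scores.append((d, abs(pearson(column, labels))))
    return sorted(scores, key=lambda s: s[1], reverse=True)


def make_synthetic_latents(n_samples=500, n_dims=8, shortcut_dim=3, seed=0):
    """Assumption: stand-in for VAE encoder means; one dimension is shifted by the label."""
    rng = random.Random(seed)
    latents, labels = [], []
    for _ in range(n_samples):
        y = rng.randint(0, 1)
        z = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
        z[shortcut_dim] += 2.0 * y  # inject a label-dependent shift
        latents.append(z)
        labels.append(y)
    return latents, labels


latents, labels = make_synthetic_latents()
ranking = rank_latent_dims(latents, labels)
top_dim, top_score = ranking[0]
print(f"most label-correlated latent dimension: {top_dim} (|r| = {top_score:.2f})")
```

A high score only marks a latent dimension as a candidate. In line with the semi-automatic evaluation the abstract describes, a person would still need to inspect what that dimension encodes before calling it a shortcut.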

Code Implementations: 1 repo