LG AI · May 18, 2025

Unsupervised Invariant Risk Minimization

arXiv:2505.12506v27.11 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses robust representation learning in unsupervised settings. The contribution is incremental: it extends existing Invariant Risk Minimization (IRM) concepts to unlabeled data.

The authors tackle the problem of learning invariant representations without labeled data by proposing an unsupervised framework for Invariant Risk Minimization that defines invariance through feature distribution alignment. The framework includes two methods, PICA and VIAE. Experiments on a synthetic dataset and modified versions of MNIST show that the methods capture invariant structure and generalize across environments.

We propose a novel unsupervised framework for Invariant Risk Minimization (IRM), extending the concept of invariance to settings where labels are unavailable. Traditional IRM methods rely on labeled data to learn representations that are robust to distributional shifts across environments. In contrast, our approach redefines invariance through feature distribution alignment, enabling robust representation learning from unlabeled data. We introduce two methods within this framework: Principal Invariant Component Analysis (PICA), a linear method that extracts invariant directions under Gaussian assumptions, and Variational Invariant Autoencoder (VIAE), a deep generative model that disentangles environment-invariant and environment-dependent latent factors. Our approach is based on a novel "unsupervised" structural causal model and supports environment-conditioned sample generation and intervention. Empirical evaluations on a synthetic dataset and modified versions of MNIST demonstrate the effectiveness of our methods in capturing invariant structure, preserving relevant information, and generalizing across environments without access to labels.
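
To make "invariant directions under Gaussian assumptions" concrete, here is a minimal sketch of one way to find linear directions along which two environments share the same mean and variance. It is an illustrative moment-matching heuristic, not the paper's PICA algorithm, whose details are not given in the abstract. The function name `invariant_directions` and its parameters are hypothetical.

```python
import numpy as np


def invariant_directions(x_a, x_b, k=1):
    """Return k unit directions along which two environments match in
    mean and covariance, plus their mismatch scores.

    Assumption: a moment-matching heuristic for illustration only,
    not the PICA method from the paper.

    x_a, x_b: arrays of shape (n_samples, n_features), one per environment.
    """
    # First- and second-moment gaps between the two environments.
    mean_gap = x_a.mean(axis=0) - x_b.mean(axis=0)
    cov_gap = np.cov(x_a, rowvar=False) - np.cov(x_b, rowvar=False)

    # Positive semidefinite mismatch matrix:
    # w @ mismatch @ w == ||cov_gap @ w||^2 + (mean_gap @ w)^2,
    # which is zero only when both moments agree along w.
    mismatch = cov_gap @ cov_gap + np.outer(mean_gap, mean_gap)

    # eigh returns eigenvalues in ascending order, so the first k
    # eigenvectors are the directions with the smallest mismatch.
    eigvals, eigvecs = np.linalg.eigh(mismatch)
    return eigvecs[:, :k].T, eigvals[:k]
```

Given two sample matrices drawn from different environments, the returned rows are candidate invariant directions, and the scores indicate how closely the first two moments agree along each one. For Gaussian data, matching mean and covariance along a direction means the projected distributions coincide, which is why a moment-based criterion is a natural stand-in here.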
