LG
Feb 27, 2025

Contrastive MIM: A Contrastive Mutual Information Framework for Unified Generative and Discriminative Representation Learning

arXiv:2502.19642v2
h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses a central problem in representation learning by providing a unified framework for generative and discriminative tasks, though it appears incremental because it builds on the existing MIM method.

The paper tackles the challenge of learning representations that generalize well to downstream tasks. It introduces cMIM, a contrastive extension of the Mutual Information Machine, which is reported to outperform MIM and InfoNCE on classification and regression tasks while maintaining comparable reconstruction quality.

Learning representations that generalize well to unknown downstream tasks is a central challenge in representation learning. Existing approaches such as contrastive learning, self-supervised masking, and denoising auto-encoders address this challenge with varying trade-offs. In this paper, we introduce the contrastive Mutual Information Machine (cMIM), a probabilistic framework that augments the Mutual Information Machine (MIM) with a novel contrastive objective. While MIM maximizes mutual information between inputs and latent variables and encourages clustering of latent codes, its representations underperform on discriminative tasks compared to state-of-the-art alternatives. cMIM addresses this limitation by enforcing global discriminative structure while retaining MIM's generative strengths. We present two main contributions: (1) we propose cMIM, a contrastive extension of MIM that eliminates the need for positive data augmentation and is robust to batch size, unlike InfoNCE-based methods; (2) we introduce informative embeddings, a general technique for extracting enriched representations from encoder-decoder models that substantially improve discriminative performance without additional training, and which apply broadly beyond MIM. Empirical results demonstrate that cMIM consistently outperforms MIM and InfoNCE in classification and regression tasks, while preserving comparable reconstruction quality. These findings suggest that cMIM provides a unified framework for learning representations that are simultaneously effective for discriminative and generative applications.
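
For context on the InfoNCE baseline that the abstract compares against, here is a minimal sketch of the standard InfoNCE loss in plain Python. It is not the cMIM objective, which this page does not spell out. The function name `info_nce_loss`, the default `temperature`, and the toy vectors are illustrative assumptions, and inputs are assumed to be non-zero vectors.

```python
import math


def info_nce_loss(anchors, positives, temperature=0.1):
    """Standard InfoNCE over a batch (illustrative sketch, not cMIM).

    Row i of `positives` is the positive example for row i of `anchors`;
    every other row in the batch acts as a negative.
    """
    def normalize(v):
        # Assumes v is non-zero; scales it to unit length.
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]

    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index i as the target class.
        total += log_z - logits[i]
    return total / len(a)


# Toy data: two orthogonal embeddings, matched and then mismatched.
matched = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this formulation the positives must come from somewhere, typically data augmentation, and the negatives come from the rest of the batch. The mutual information estimate it yields, log N minus the loss for a batch of N pairs, cannot exceed log N, which is one common explanation for the batch-size sensitivity the abstract mentions.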

