LG · Mar 11, 2025

Learning to Match Unpaired Data with Minimum Entropy Coupling

arXiv:2503.08501v2 · 4 citations · h-index: 3 · ICML
Originality: Highly original
AI Analysis

This paper addresses modality coupling for machine learning applications with continuous data, an incremental advance over existing methods that focus on discrete distributions.

The paper tackles the problem of learning joint distributions from unpaired multimodal data. It proposes a method for the continuous Minimum Entropy Coupling (MEC) problem that uses generative diffusion models to minimize the joint entropy while relaxing the marginal constraints. The authors demonstrate the method on single-cell multi-omics alignment and unpaired image translation, where they report that it outperforms specialized methods.

Multimodal data is a precious asset enabling a variety of downstream tasks in machine learning. However, real-world data collected across different modalities is often not paired, which makes learning a joint distribution a significant challenge. A prominent approach to the modality coupling problem is Minimum Entropy Coupling (MEC), which seeks to minimize the joint entropy while satisfying constraints on the marginals. Existing approaches to the MEC problem focus on finite, discrete distributions, which limits their application to continuous data. In this work, we propose a novel method to solve the continuous MEC problem, using well-known generative diffusion models that learn to approximate and minimize the joint entropy through a cooperative scheme, while satisfying a relaxed version of the marginal constraints. We empirically demonstrate that our method, DDMEC, is general and can be easily used to address challenging tasks, including unsupervised single-cell multi-omics data alignment and unpaired image translation, outperforming specialized methods.
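
To make the MEC objective concrete, the sketch below works in the finite, discrete setting that the abstract says existing approaches focus on. It is not the paper's DDMEC method, which targets continuous data with diffusion models. It uses a simple greedy heuristic, a swapped-in technique chosen for illustration: repeatedly pair the largest remaining mass of each marginal. The function names `greedy_coupling` and `entropy_bits` and the example marginals are hypothetical, and the heuristic is not guaranteed to find the true minimum-entropy coupling.

```python
import math


def greedy_coupling(p, q, tol=1e-12):
    """Greedy heuristic for a low-entropy coupling of two discrete marginals.

    Illustrative only: returns a joint table whose rows sum to p and whose
    columns sum to q, built by matching the largest remaining masses.
    """
    p, q = list(p), list(q)          # remaining mass in each marginal
    joint = [[0.0] * len(q) for _ in p]
    while True:
        i = max(range(len(p)), key=p.__getitem__)  # largest remaining in p
        j = max(range(len(q)), key=q.__getitem__)  # largest remaining in q
        mass = min(p[i], q[j])
        if mass <= tol:              # nothing left to assign
            break
        joint[i][j] += mass
        p[i] -= mass
        q[j] -= mass
    return joint


def entropy_bits(joint):
    """Shannon entropy (in bits) of a joint probability table."""
    return -sum(x * math.log2(x) for row in joint for x in row if x > 0)


# Hypothetical marginals for two "modalities".
p = [0.5, 0.5]
q = [0.5, 0.25, 0.25]

coupled = greedy_coupling(p, q)
independent = [[a * b for b in q] for a in p]  # product coupling for comparison

print(entropy_bits(coupled))      # 1.5
print(entropy_bits(independent))  # 2.5
```

Both tables satisfy the same marginal constraints, but the greedy coupling concentrates mass on fewer cells and so has lower joint entropy (1.5 bits against 2.5 bits for the independent coupling). That reduction is the quantity MEC minimizes.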

