LGMLJan 18, 2025

Mutual Regression Distance

arXiv:2501.10617v1h-index: 3
Originality Highly original
AI Analysis

This work addresses a fundamental limitation in distribution distance measures for machine learning tasks, offering a more efficient and effective alternative to existing methods.

The authors tackled the problem of measuring dissimilarity between distributions by proposing Mutual Regression Distance (MRD), a novel distance that exploits manifold properties of data, and demonstrated its effectiveness with lower computational complexity and superior performance in applications like clustering and domain adaptation.

The maximum mean discrepancy and Wasserstein distance are popular distance measures between distributions and play important roles in many machine learning problems such as metric learning, generative modeling, domain adaption, and clustering. However, since they are functions of pair-wise distances between data points in two distributions, they do not exploit the potential manifold properties of data such as smoothness and hence are not effective in measuring the dissimilarity between the two distributions in the form of manifolds. In this paper, different from existing measures, we propose a novel distance called Mutual Regression Distance (MRD) induced by a constrained mutual regression problem, which can exploit the manifold property of data. We prove that MRD is a pseudometric that satisfies almost all the axioms of a metric. Since the optimization of the original MRD is costly, we provide a tight MRD and a simplified MRD, based on which a heuristic algorithm is established. We also provide kernel variants of MRDs that are more effective in handling nonlinear data. Our MRDs especially the simplified MRDs have much lower computational complexity than the Wasserstein distance. We provide theoretical guarantees, such as robustness, for MRDs. Finally, we apply MRDs to distribution clustering, generative models, and domain adaptation. The numerical results demonstrate the effectiveness and superiority of MRDs compared to the baselines.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes