Agnès Desolneux

ML
h-index23
9papers
141citations
Novelty41%
AI Score37

9 Papers

MLJun 29, 2022
Can Push-forward Generative Models Fit Multimodal Distributions?

Antoine Salmona, Valentin de Bortoli, Julie Delon et al.

Many generative models synthesize data by transforming a standard Gaussian random variable using a deterministic neural network. Among these models are the Variational Autoencoders and the Generative Adversarial Networks. In this work, we call them "push-forward" models and study their expressivity. We show that the Lipschitz constant of these generative networks has to be large in order to fit multimodal distributions. More precisely, we show that the total variation distance and the Kullback-Leibler divergence between the generated and the data distribution are bounded from below by a constant depending on the mode separation and the Lipschitz constant. Since constraining the Lipschitz constants of neural networks is a common way to stabilize generative models, there is a provable trade-off between the ability of push-forward models to approximate multimodal distributions and the stability of their training. We validate our findings on one-dimensional and image datasets and empirically show that generative models consisting of stacked networks with stochastic input at each step, such as diffusion models do not suffer of such limitations.

MLOct 17, 2023
Gromov-Wasserstein-like Distances in the Gaussian Mixture Models Space

Antoine Salmona, Julie Delon, Agnès Desolneux

The Gromov-Wasserstein (GW) distance is frequently used in machine learning to compare distributions across distinct metric spaces. Despite its utility, it remains computationally intensive, especially for large-scale problems. Recently, a novel Wasserstein distance specifically tailored for Gaussian mixture models (GMMs) and known as MW2 (mixture Wasserstein) has been introduced by several authors. In scenarios where data exhibit clustering, this approach simplifies to a small-scale discrete optimal transport problem, which complexity depends solely on the number of Gaussian components in the GMMs. This paper aims to incorporate invariance properties into MW2. This is done by introducing new Gromov-type distances, designed to be isometry-invariant in Euclidean spaces and applicable for comparing GMMs across different dimensional spaces. Our first contribution is the Mixture Gromov Wasserstein distance (MGW2), which can be viewed as a "Gromovized" version of MW2. This new distance has a straightforward discrete formulation, making it highly efficient for estimating distances between GMMs in practical applications. To facilitate the derivation of a transport plan between GMMs, we present a second distance, the Embedded Wasserstein distance (EW2). This distance turns out to be closely related to several recent alternatives to Gromov-Wasserstein. We show that EW2 can be adapted to derive a distance as well as optimal transportation plans between GMMs. We demonstrate the efficiency of these newly proposed distances on medium to large-scale problems, including shape matching and hyperspectral image color transfer.

LGJan 28, 2025
Bridging Contrastive Learning and Domain Adaptation: Theoretical Perspective and Practical Application

Gonzalo Iñaki Quintana, Laurence Vancamberg, Vincent Jugnon et al.

This work studies the relationship between Contrastive Learning and Domain Adaptation from a theoretical perspective. The two standard contrastive losses, NT-Xent loss (Self-supervised) and Supervised Contrastive loss, are related to the Class-wise Mean Maximum Discrepancy (CMMD), a dissimilarity measure widely used for Domain Adaptation. Our work shows that minimizing the contrastive losses decreases the CMMD and simultaneously improves class-separability, laying the theoretical groundwork for the use of Contrastive Learning in the context of Domain Adaptation. Due to the relevance of Domain Adaptation in medical imaging, we focused the experiments on mammography images. Extensive experiments on three mammography datasets - synthetic patches, clinical (real) patches, and clinical (real) images - show improved Domain Adaptation, class-separability, and classification performance, when minimizing the Supervised Contrastive loss.

MLSep 12, 2025
Repulsive Monte Carlo on the sphere for the sliced Wasserstein distance

Vladimir Petrovic, Rémi Bardenet, Agnès Desolneux

In this paper, we consider the problem of computing the integral of a function on the unit sphere, in any dimension, using Monte Carlo methods. Although the methods we present are general, our guiding thread is the sliced Wasserstein distance between two measures on $\mathbb{R}^d$, which is precisely an integral on the $d$-dimensional sphere. The sliced Wasserstein distance (SW) has gained momentum in machine learning either as a proxy to the less computationally tractable Wasserstein distance, or as a distance in its own right, due in particular to its built-in alleviation of the curse of dimensionality. There has been recent numerical benchmarks of quadratures for the sliced Wasserstein, and our viewpoint differs in that we concentrate on quadratures where the nodes are repulsive, i.e. negatively dependent. Indeed, negative dependence can bring variance reduction when the quadrature is adapted to the integration task. Our first contribution is to extract and motivate quadratures from the recent literature on determinantal point processes (DPPs) and repelled point processes, as well as repulsive quadratures from the literature specific to the sliced Wasserstein distance. We then numerically benchmark these quadratures. Moreover, we analyze the variance of the UnifOrtho estimator, an orthogonal Monte Carlo estimator. Our analysis sheds light on UnifOrtho's success for the estimation of the sliced Wasserstein in large dimensions, as well as counterexamples from the literature. Our final recommendation for the computation of the sliced Wasserstein distance is to use randomized quasi-Monte Carlo in low dimensions and \emph{UnifOrtho} in large dimensions. DPP-based quadratures only shine when quasi-Monte Carlo also does, while repelled quadratures show moderate variance reduction in general, but more theoretical effort is needed to make them robust.

LGSep 2, 2025
Differentiable Expectation-Maximisation and Applications to Gaussian Mixture Model Optimal Transport

Samuel Boïté, Eloi Tanguy, Julie Delon et al.

The Expectation-Maximisation (EM) algorithm is a central tool in statistics and machine learning, widely used for latent-variable models such as Gaussian Mixture Models (GMMs). Despite its ubiquity, EM is typically treated as a non-differentiable black box, preventing its integration into modern learning pipelines where end-to-end gradient propagation is essential. In this work, we present and compare several differentiation strategies for EM, from full automatic differentiation to approximate methods, assessing their accuracy and computational efficiency. As a key application, we leverage this differentiable EM in the computation of the Mixture Wasserstein distance $\mathrm{MW}_2$ between GMMs, allowing $\mathrm{MW}_2$ to be used as a differentiable loss in imaging and machine learning tasks. To complement our practical use of $\mathrm{MW}_2$, we contribute a novel stability result which provides theoretical justification for the use of $\mathrm{MW}_2$ with EM, and also introduce a novel unbalanced variant of $\mathrm{MW}_2$. Numerical experiments on barycentre computation, colour and style transfer, image generation, and texture synthesis illustrate the versatility of the proposed approach in different settings.

PROct 25, 2021
On quantitative Laplace-type convergence results for some exponential probability measures, with two applications

Valentin De Bortoli, Agnès Desolneux

Laplace-type results characterize the limit of sequence of measures $(π_\varepsilon)_{\varepsilon >0}$ with density w.r.t the Lebesgue measure $(\mathrm{d} π_\varepsilon / \mathrm{d} \mathrm{Leb})(x) \propto \exp[-U(x)/\varepsilon]$ when the temperature $\varepsilon>0$ converges to $0$. If a limiting distribution $π_0$ exists, it concentrates on the minimizers of the potential $U$. Classical results require the invertibility of the Hessian of $U$ in order to establish such asymptotics. In this work, we study the particular case of norm-like potentials $U$ and establish quantitative bounds between $π_\varepsilon$ and $π_0$ w.r.t. the Wasserstein distance of order $1$ under an invertibility condition of a generalized Jacobian. One key element of our proof is the use of geometric measure theory tools such as the coarea formula. We apply our results to the study of maximum entropy models (microcanonical/macrocanonical distributions) and to the convergence of the iterates of the Stochastic Gradient Langevin Dynamics (SGLD) algorithm at low temperatures for non-convex minimization.

CVJul 30, 2020
Unidentified Floating Object detection in maritime environment using dictionary learning

Darshan Venkatrayappa, Agnès Desolneux, Jean-Michel Hubert et al.

Maritime domain is one of the most challenging scenarios for object detection due to the complexity of the observed scene. In this article, we present a new approach to detect unidentified floating objects in the maritime environment. The proposed approach is capable of detecting floating objects without any prior knowledge of their visual appearance, shape or location. The input image from the video stream is denoised using a visual dictionary learned from a K-SVD algorithm. The denoised image is made of self-similar content. Later, we extract the residual image, which is the difference between the original image and the denoised (self-similar) image. Thus, the residual image contains noise and salient structures (objects). These salient structures can be extracted using an a contrario model. We demonstrate the capabilities of our algorithm by testing it on videos exhibiting varying maritime scenarios.

MLFeb 23, 2018
Exact Sampling of Determinantal Point Processes without Eigendecomposition

Claire Launay, Bruno Galerne, Agnès Desolneux

Determinantal point processes (DPPs) enable the modeling of repulsion: they provide diverse sets of points. The repulsion is encoded in a kernel $K$ that can be seen as a matrix storing the similarity between points. The diversity comes from the fact that the inclusion probability of a subset is equal to the determinant of a submatrice of $K$. The exact algorithm to sample DPPs uses the spectral decomposition of $K$, a computation that becomes costly when dealing with a high number of points. Here, we present an alternative exact algorithm in the discrete setting that avoids the eigenvalues and the eigenvectors computation. Instead, it relies on Cholesky decompositions. This is a two steps strategy: first, it samples a Bernoulli point process with an appropriate distribution, then it samples the target DPP distribution through a thinning procedure. Not only is the method used here innovative, but this algorithm can be competitive with the original algorithm or even faster for some applications specified here.

CVJul 22, 2017
A survey of exemplar-based texture synthesis

Lara Raad, Axel Davy, Agnès Desolneux et al.

Exemplar-based texture synthesis is the process of generating, from an input sample, new texture images of arbitrary size and which are perceptually equivalent to the sample. The two main approaches are statistics-based methods and patch re-arrangement methods. In the first class, a texture is characterized by a statistical signature; then, a random sampling conditioned to this signature produces genuinely different texture images. The second class boils down to a clever "copy-paste" procedure, which stitches together large regions of the sample. Hybrid methods try to combine ideas from both approaches to avoid their hurdles. The recent approaches using convolutional neural networks fit to this classification, some being statistical and others performing patch re-arrangement in the feature space. They produce impressive synthesis on various kinds of textures. Nevertheless, we found that most real textures are organized at multiple scales, with global structures revealed at coarse scales and highly varying details at finer ones. Thus, when confronted with large natural images of textures the results of state-of-the-art methods degrade rapidly, and the problem of modeling them remains wide open.