NAMar 2, 2018
Differentiation and regularity of semi-discrete optimal transport with respect to the parameters of the discrete measureFrédéric de Gournay, Jonas Kahn, Léo Lebrat
This paper aims at determining under which conditions the semi-discrete optimal transport is twice differentiable with respect to the parameters of the discrete measure and exhibits numerical applications. The discussion focuses on minimal conditions on the background measure to ensure differentiability. We provide numerical illustrations in stippling and blue noise problems.
NAFeb 2, 2019
Optimal Transport Approximation of 2-Dimensional MeasuresFrédéric de Gournay, Jonas Kahn, Léo Lebrat et al.
We propose a fast and scalable algorithm to project a given density on a set of structured measures defined over a compact 2D domain. The measures can be discrete or supported on curves for instance. The proposed principle and algorithm are a natural generalization of previous results revolving around the generation of blue-noise point distributions, such as Lloyd's algorithm or more advanced techniques based on power diagrams. We analyze the convergence properties and propose new approaches to accelerate the generation of point distributions. We also design new algorithms to project curves onto spaces of curves with bounded length and curvature or speed and acceleration. We illustrate the algorithm's interest through applications in advanced sampling theory, non-photorealistic rendering and path planning.
NAJun 25, 2018
3/4-discrete optimal transportFrédéric de Gournay, Jonas Kahn, Léo Lebrat
This paper deals with the 3/4-discrete 2-Wasserstein optimal transport between two measures, where one is supported by a set of segment and the other one is supported by a set of Dirac masses. We select the most suitable optimization procedure that computes the optimal transport and provide numerical examples of approximation of cloud data by segments.
MLSep 15, 2016
Recursive nearest agglomeration (ReNA): fast clustering for approximation of structured signalsAndrés Hoyos-Idrobo, Gaël Varoquaux, Jonas Kahn et al.
In this work, we revisit fast dimension reduction approaches, as with random projections and random sampling. Our goal is to summarize the data to decrease computational costs and memory footprint of subsequent analysis. Such dimension reduction can be very efficient when the signals of interest have a strong structure, such as with images. We focus on this setting and investigate feature clustering schemes for data reductions that capture this structure. An impediment to fast dimension reduction is that good clustering comes with large algorithmic costs. We address it by contributing a linear-time agglomerative clustering scheme, Recursive Nearest Agglomeration (ReNA). Unlike existing fast agglomerative schemes, it avoids the creation of giant clusters. We empirically validate that it approximates the data as well as traditional variance-minimizing clustering schemes that have a quadratic complexity. In addition, we analyze signal approximation with feature clustering and show that it can remove noise, improving subsequent analysis steps. As a consequence, data reduction by clustering features with ReNA yields very fast and accurate models, enabling to process large datasets on budget. Our theoretical analysis is backed by extensive experiments on publicly-available data that illustrate the computation efficiency and the denoising properties of the resulting dimension reduction scheme.
MLNov 16, 2015
Fast clustering for scalable statistical analysis on structured imagesBertrand Thirion, Andrés Hoyos-Idrobo, Jonas Kahn et al.
The use of brain images as markers for diseases or behavioral differences is challenged by the small effects size and the ensuing lack of power, an issue that has incited researchers to rely more systematically on large cohorts. Coupled with resolution increases, this leads to very large datasets. A striking example in the case of brain imaging is that of the Human Connectome Project: 20 Terabytes of data and growing. The resulting data deluge poses severe challenges regarding the tractability of some processing steps (discriminant analysis, multivariate models) due to the memory demands posed by these data. In this work, we revisit dimension reduction approaches, such as random projections, with the aim of replacing costly function evaluations by cheaper ones while decreasing the memory requirements. Specifically, we investigate the use of alternate schemes, based on fast clustering, that are well suited for signals exhibiting a strong spatial structure, such as anatomical and functional brain images. Our contribution is twofold: i) we propose a linear-time clustering scheme that bypasses the percolation issues inherent in these algorithms and thus provides compressions nearly as good as traditional quadratic-complexity variance-minimizing clustering schemes, ii) we show that cluster-based compression can have the virtuous effect of removing high-frequency noise, actually improving subsequent estimations steps. As a consequence, the proposed approach yields very accurate models on several large-scale problems yet with impressive gains in computational efficiency, making it possible to analyze large datasets.
NASep 1, 2015
A projection algorithm on measures setsNicolas Chauffert, Philippe Ciuciu, Jonas Kahn et al.
We consider the problem of projecting a probability measure $π$ on a set $\mathcal{M}\_N$ of Radon measures. The projection is defined as a solution of the following variational problem:\begin{equation*}\inf\_{μ\in \mathcal{M}\_N} \|h\star (μ- π)\|\_2^2,\end{equation*}where $h\in L^2(Ω)$ is a kernel, $Ω\subset \R^d$ and $\star$ denotes the convolution operator.To motivate and illustrate our study, we show that this problem arises naturally in various practical image rendering problems such as stippling (representing an image with $N$ dots) or continuous line drawing (representing an image with a continuous line).We provide a necessary and sufficient condition on the sequence $(\mathcal{M}\_N)\_{N\in \N}$ that ensures weak convergence of the projections $(μ^*\_N)\_{N\in \N}$ to $π$.We then provide a numerical algorithm to solve a discretized version of the problem and show several illustrations related to computer-assisted synthesis of artistic paintings/drawings.