Monge, Bregman and Occam: Interpretable Optimal Transport in High-Dimensions with Feature-Sparse Maps
This work addresses interpretability in high-dimensional optimal transport for applications such as genomics, offering a method that retains gene-level insights. The contribution is incremental in that it builds on existing entropic map approaches.
The authors tackled the problem of estimating interpretable optimal transport maps in high-dimensional spaces by proposing a new model that uses sparsity-inducing regularizers to produce feature-sparse displacement vectors. This enables meaningful analysis of single-cell transcription data directly in a 34,000-dimensional gene count space, without dimensionality reduction.
Optimal transport (OT) theory focuses, among all maps $T:\mathbb{R}^d\rightarrow \mathbb{R}^d$ that can morph a probability measure onto another, on those that are the ``thriftiest'', i.e. such that the averaged cost $c(x, T(x))$ between $x$ and its image $T(x)$ is as small as possible. Many computational approaches have been proposed to estimate such Monge maps when $c$ is the squared Euclidean ($\ell_2^2$) distance, e.g., using entropic maps [Pooladian'22] or neural networks [Makkuva'20, Korotin'20]. We propose a new model for transport maps, built on a family of translation-invariant costs $c(x, y):=h(x-y)$, where $h:=\tfrac{1}{2}\|\cdot\|_2^2+\tau$ and $\tau$ is a regularizer. We propose a generalization of the entropic map suitable for $h$, and highlight a surprising link tying it to the Bregman centroids of the divergence $D_h$ generated by $h$ and to the proximal operator of $\tau$. We show that choosing a sparsity-inducing norm for $\tau$ results in maps that apply Occam's razor to transport, in the sense that the displacement vectors $\Delta(x):= T(x)-x$ they induce are sparse, with a sparsity pattern that varies depending on $x$. We showcase the ability of our method to estimate meaningful OT maps for high-dimensional single-cell transcription data, in the $34{,}000$-dimensional space of gene counts for cells, without using dimensionality reduction, thus retaining the ability to interpret all displacements at the gene level.
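To illustrate why a sparsity-inducing $\tau$ yields sparse displacements, the sketch below takes $\tau = \gamma\|\cdot\|_1$, whose proximal operator is coordinate-wise soft-thresholding. It assumes the classical form of Monge maps for translation-invariant costs, $T(x) = x - \nabla h^*(\nabla f(x))$ for a dual potential $f$, together with the identity $\nabla h^* = \mathrm{prox}_\tau$ that holds when $h = \tfrac{1}{2}\|\cdot\|_2^2 + \tau$. The function names, the value of $\gamma$, and the gradient values are hypothetical and chosen only for illustration; this is not the estimator described in the abstract.

```python
def soft_threshold(w, gamma):
    """Proximal operator of tau = gamma * ||.||_1, applied coordinate-wise."""
    return [max(abs(wi) - gamma, 0.0) * (1.0 if wi > 0 else -1.0) for wi in w]


def sparse_displacement(grad_f, gamma):
    """Displacement T(x) - x = -prox_tau(grad_f), assuming h = 0.5*||.||^2 + gamma*||.||_1."""
    return [-p for p in soft_threshold(grad_f, gamma)]


# Hypothetical gradient of a dual potential at one point x (illustrative values only).
grad_f = [0.05, -1.5, 0.3, 2.0, -0.2]

delta = sparse_displacement(grad_f, gamma=0.5)

# Coordinates (e.g. genes) along which the point x is actually moved.
support = [i for i, d in enumerate(delta) if d != 0.0]
```

Coordinates whose gradient magnitude falls below $\gamma$ are left untouched, so the support of the displacement depends on the point $x$ through $\nabla f(x)$. A larger $\gamma$ gives sparser displacements at the price of shrinking the coordinates that remain active.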