Emmanuel Soubies

h-index10

8papers

632citations

Novelty54%

AI Score39

Ranked #76,537 of 194,257 authors (top 39%)#17,083 in LG (top 43%)

8 Papers

2.3NANov 15, 2018

The Sliding Frank-Wolfe Algorithm and its Application to Super-Resolution Microscopy

Quentin Denoyelle, Vincent Duval, Gabriel Peyré et al.

This paper showcases the theoretical and numerical performance of the Sliding Frank-Wolfe, which is a novel optimization algorithm to solve the BLASSO sparse spikes super-resolution problem. The BLASSO is a continuous (i.e. off-the-grid or grid-less) counterpart to the well-known 1 sparse regularisation method (also known as LASSO or Basis Pursuit). Our algorithm is a variation on the classical Frank-Wolfe (also known as conditional gradient) which follows a recent trend of interleaving convex optimization updates (corresponding to adding new spikes) with non-convex optimization steps (corresponding to moving the spikes). Our main theoretical result is that this algorithm terminates in a finite number of steps under a mild non-degeneracy hypothesis. We then target applications of this method to several instances of single molecule fluorescence imaging modalities, among which certain approaches rely heavily on the inversion of a Laplace transform. Our second theoretical contribution is the proof of the exact support recovery property of the BLASSO to invert the 1-D Laplace transform in the case of positive spikes. On the numerical side, we conclude this paper with an extensive study of the practical performance of the Sliding Frank-Wolfe on different instantiations of single molecule fluorescence imaging, including convolutive and non-convolutive (Laplace-like) operators. This shows the versatility and superiority of this method with respect to alternative sparse recovery technics.

3.3SPJul 12, 2023

Deep Unrolling for Nonconvex Robust Principal Component Analysis

Elizabeth Z. C. Tan, Caroline Chaux, Emmanuel Soubies et al.

We design algorithms for Robust Principal Component Analysis (RPCA) which consists in decomposing a matrix into the sum of a low rank matrix and a sparse matrix. We propose a deep unrolled algorithm based on an accelerated alternating projection algorithm which aims to solve RPCA in its nonconvex form. The proposed procedure combines benefits of deep neural networks and the interpretability of the original algorithm and it automatically learns hyperparameters. We demonstrate the unrolled algorithm's effectiveness on synthetic datasets and also on a face modeling problem, where it leads to both better numerical and visual performances.

2.5OCJul 6

On The Linear Convergence of Bregman Proximal Gradient Methods with Applications to Kullback--Leibler regression

Jonathan Chirinos-Rodríguez, Christian Daniele, Cédric Févotte et al.

Bregman Proximal Gradient methods (BPGM) exploit the underlying geometry of the objective function through a carefully chosen mirror map. In this work, we introduce a novel notion of strong convexity, termed Restricted Relative Strong Convexity, and establish linear convergence rates for BPGM under this condition. We then exploit the proposed theoretical framework to provide an in-depth analysis of the convergence of BPGM for (regularized) Kullback--Leibler regression problems, covering scenarios with both unique and non-unique minimizers, as well as regularized and unregularized formulations. Specifically, we demonstrate that using the popular Burg's entropy as a distance-generating function may only yield linear convergence for certain KL regression problems. In contrast, we show that employing a smoothed version of the Burg's entropy induces the suitable geometry required to guarantee linear convergence. We conclude with numerical experiments that nicely align with our theoretical findings.

14.4LGJan 6, 2025

Convexity in ReLU Neural Networks: beyond ICNNs?

Anne Gagneux, Mathurin Massias, Emmanuel Soubies et al.

Convex functions and their gradients play a critical role in mathematical imaging, from proximal optimization to Optimal Transport. The successes of deep learning has led many to use learning-based methods, where fixed functions or operators are replaced by learned neural networks. Regardless of their empirical superiority, establishing rigorous guarantees for these methods often requires to impose structural constraints on neural architectures, in particular convexity. The most popular way to do so is to use so-called Input Convex Neural Networks (ICNNs). In order to explore the expressivity of ICNNs, we provide necessary and sufficient conditions for a ReLU neural network to be convex. Such characterizations are based on product of weights and activations, and write nicely for any architecture in the path-lifting framework. As particular applications, we study our characterizations in depth for 1 and 2-hidden-layer neural networks: we show that every convex function implemented by a 1-hidden-layer ReLU network can be also expressed by an ICNN with the same architecture; however this property no longer holds with more layers. Finally, we provide a numerical procedure that allows an exact check of convexity for ReLU neural networks with a large number of affine regions.

4.1OCJun 18, 2025

Proximal Operators of Sorted Nonconvex Penalties

Anne Gagneux, Mathurin Massias, Emmanuel Soubies

This work studies the problem of sparse signal recovery with automatic grouping of variables. To this end, we investigate sorted nonsmooth penalties as a regularization approach for generalized linear models. We focus on a family of sorted nonconvex penalties which generalizes the Sorted L1 Norm (SLOPE). These penalties are designed to promote clustering of variables due to their sorted nature, while the nonconvexity reduces the shrinkage of coefficients. Our goal is to provide efficient ways to compute their proximal operator, enabling the use of popular proximal algorithms to solve composite optimization problems with this choice of sorted penalties. We distinguish between two classes of problems: the weakly convex case where computing the proximal operator remains a convex problem, and the nonconvex case where computing the proximal operator becomes a challenging nonconvex combinatorial problem. For the weakly convex case (e.g. sorted MCP and SCAD), we explain how the Pool Adjacent Violators (PAV) algorithm can exactly compute the proximal operator. For the nonconvex case (e.g. sorted Lq with q in ]0,1[), we show that a slight modification of this algorithm turns out to be remarkably efficient to tackle the computation of the proximal operator. We also present new theoretical insights on the minimizers of the nonconvex proximal problem. We demonstrate the practical interest of using such penalties on several experiments.

1.8LGFeb 15, 2022Code

Accelerating Non-Negative and Bounded-Variable Linear Regression Algorithms with Safe Screening

Cassio F. Dantas, Emmanuel Soubies, Cédric Févotte

Non-negative and bounded-variable linear regression problems arise in a variety of applications in machine learning and signal processing. In this paper, we propose a technique to accelerate existing solvers for these problems by identifying saturated coordinates in the course of iterations. This is akin to safe screening techniques previously proposed for sparsity-regularized regression problems. The proposed strategy is provably safe as it provides theoretical guarantees that the identified coordinates are indeed saturated in the optimal solution. Experimental results on synthetic and real data show compelling accelerations for both non-negative and bounded-variable problems.

1.6LGDec 10, 2021Code

Leveraging Joint-Diagonalization in Transform-Learning NMF

Sixin Zhang, Emmanuel Soubies, Cédric Févotte

Non-negative matrix factorization with transform learning (TL-NMF) is a recent idea that aims at learning data representations suited to NMF. In this work, we relate TL-NMF to the classical matrix joint-diagonalization (JD) problem. We show that, when the number of data realizations is sufficiently large, TL-NMF can be replaced by a two-step approach -- termed as JD+NMF -- that estimates the transform through JD, prior to NMF computation. In contrast, we found that when the number of data realizations is limited, not only is JD+NMF no longer equivalent to TL-NMF, but the inherent low-rank constraint of TL-NMF turns out to be an essential ingredient to learn meaningful transforms for NMF.

7.5LGFeb 22, 2021Code

Expanding boundaries of Gap Safe screening

Cassio F. Dantas, Emmanuel Soubies, Cédric Févotte

Sparse optimization problems are ubiquitous in many fields such as statistics, signal/image processing and machine learning. This has led to the birth of many iterative algorithms to solve them. A powerful strategy to boost the performance of these algorithms is known as safe screening: it allows the early identification of zero coordinates in the solution, which can then be eliminated to reduce the problem's size and accelerate convergence. In this work, we extend the existing Gap Safe screening framework by relaxing the global strong-concavity assumption on the dual cost function. Instead, we exploit local regularity properties, that is, strong concavity on well-chosen subsets of the domain. The non-negativity constraint is also integrated to the existing framework. Besides making safe screening possible to a broader class of functions that includes beta-divergences (e.g., the Kullback-Leibler divergence), the proposed approach also improves upon the existing Gap Safe screening rules on previously applicable cases (e.g., logistic regression). The proposed general framework is exemplified by some notable particular cases: logistic function, beta = 1.5 and Kullback-Leibler divergences. Finally, we showcase the effectiveness of the proposed screening rules with different solvers (coordinate descent, multiplicative-update and proximal gradient algorithms) and different data sets (binary classification, hyperspectral and count data).