Sebastian Neumayer

LG
h-index21
18papers
336citations
Novelty47%
AI Score52

18 Papers

IVAug 21, 2023
Learning Weakly Convex Regularizers for Convergent Image-Reconstruction Algorithms

Alexis Goujon, Sebastian Neumayer, Michael Unser

We propose to learn non-convex regularizers with a prescribed upper bound on their weak-convexity modulus. Such regularizers give rise to variational denoisers that minimize a convex energy. They rely on few parameters (less than 15,000) and offer a signal-processing interpretation as they mimic handcrafted sparsity-promoting regularizers. Through numerical experiments, we show that such denoisers outperform convex-regularization methods as well as the popular BM3D denoiser. Additionally, the learned regularizer can be deployed to solve inverse problems with iterative schemes that provably converge. For both CT and MRI reconstruction, the regularizer generalizes well and offers an excellent tradeoff between performance, number of parameters, guarantees, and interpretability when compared to other data-driven approaches.

IVNov 22, 2022
A Neural-Network-Based Convex Regularizer for Inverse Problems

Alexis Goujon, Sebastian Neumayer, Pakshal Bohra et al.

The emergence of deep-learning-based methods to solve image-reconstruction problems has enabled a significant increase in reconstruction quality. Unfortunately, these new methods often lack reliability and explainability, and there is a growing interest to address these shortcomings while retaining the boost in performance. In this work, we tackle this issue by revisiting regularizers that are the sum of convex-ridge functions. The gradient of such regularizers is parameterized by a neural network that has a single hidden layer with increasing and learnable activation functions. This neural network is trained within a few minutes as a multistep Gaussian denoiser. The numerical experiments for denoising, CT, and MRI reconstruction show improvements over methods that offer similar reliability guarantees.

LGOct 28, 2022
Improving Lipschitz-Constrained Neural Networks by Learning Activation Functions

Stanislas Ducotterd, Alexis Goujon, Pakshal Bohra et al.

Lipschitz-constrained neural networks have several advantages over unconstrained ones and can be applied to a variety of problems, making them a topic of attention in the deep learning community. Unfortunately, it has been shown both theoretically and empirically that they perform poorly when equipped with ReLU activation functions. By contrast, neural networks with learnable 1-Lipschitz linear splines are known to be more expressive. In this paper, we show that such networks correspond to global optima of a constrained functional optimization problem that consists of the training of a neural network composed of 1-Lipschitz linear layers and 1-Lipschitz freeform activation functions with second-order total-variation regularization. Further, we propose an efficient method to train these neural networks. Our numerical experiments show that our trained networks compare favorably with existing 1-Lipschitz neural architectures.

LGApr 13, 2022
Approximation of Lipschitz Functions using Deep Spline Neural Networks

Sebastian Neumayer, Alexis Goujon, Pakshal Bohra et al.

Lipschitz-constrained neural networks have many applications in machine learning. Since designing and training expressive Lipschitz-constrained networks is very challenging, there is a need for improved methods and a better theoretical understanding. Unfortunately, it turns out that ReLU networks have provable disadvantages in this setting. Hence, we propose to use learnable spline activation functions with at least 3 linear regions instead. We prove that this choice is optimal among all component-wise $1$-Lipschitz activation functions in the sense that no other weight constrained architecture can approximate a larger class of functions. Additionally, this choice is at least as expressive as the recently introduced non component-wise Groupsort activation function for spectral-norm-constrained weights. Previously published numerical results support our theoretical findings.

OCJun 14, 2022
Stability of Image-Reconstruction Algorithms

Pol del Aguila Pla, Sebastian Neumayer, Michael Unser

Robustness and stability of image-reconstruction algorithms have recently come under scrutiny. Their importance to medical imaging cannot be overstated. We review the known results for the topical variational regularization strategies ($\ell_2$ and $\ell_1$ regularization) and present novel stability results for $\ell_p$-regularized linear inverse problems for $p\in(1,\infty)$. Our results guarantee Lipschitz continuity for small $p$ and Hölder continuity for larger $p$. They generalize well to the $L_p(Ω)$ function spaces.

LGMar 31, 2023
On the Effect of Initialization: The Scaling Path of 2-Layer Neural Networks

Sebastian Neumayer, Lénaïc Chizat, Michael Unser

In supervised learning, the regularization path is sometimes used as a convenient theoretical proxy for the optimization path of gradient descent initialized from zero. In this paper, we study a modification of the regularization path for infinite-width 2-layer ReLU neural networks with nonzero initial distribution of the weights at different scales. By exploiting a link with unbalanced optimal-transport theory, we show that, despite the non-convexity of the 2-layer network training, this problem admits an infinite-dimensional convex counterpart. We formulate the corresponding functional-optimization problem and investigate its main properties. In particular, we show that, as the scale of the initialization ranges between $0$ and $+\infty$, the associated path interpolates continuously between the so-called kernel and rich regimes. Numerical experiments confirm that, in our setting, the scaling path and the final states of the optimization path behave similarly, even beyond these extreme points.

IVMay 11
A Stability Benchmark of Generative Regularizers for Inverse Problems

Alexander Denker, Johannes Hertrich, Sebastian Neumayer

Generative (diffusion) priors demonstrate remarkable performance in addressing inverse problems in imaging. Yet, for scientific and medical imaging, it is crucial that reconstruction techniques remain stable and reliable under imperfect settings. Typical definitions of stability encompass the notion of ''convergent regularization'', robustness to out-of-distribution data, and to inaccuracies in the forward operator or noise model. We evaluate these properties numerically. Furthermore, we benchmark generative approaches against modern optimization-based methods inspired by the widely used variational techniques. Our results give insights for which settings and applications generative priors can deliver state-of-the-art reconstructions, and on those in which they fall short or may even be problematic.

IVJul 9, 2024
Iteratively Refined Image Reconstruction with Learned Attentive Regularizers

Mehrsa Pourya, Sebastian Neumayer, Michael Unser

We propose a regularization scheme for image reconstruction that leverages the power of deep learning while hinging on classic sparsity-promoting models. Many deep-learning-based models are hard to interpret and cumbersome to analyze theoretically. In contrast, our scheme is interpretable because it corresponds to the minimization of a series of convex problems. For each problem in the series, a mask is generated based on the previous solution to refine the regularization strength spatially. In this way, the model becomes progressively attentive to the image structure. For the underlying update operator, we prove the existence of a fixed point. As a special case, we investigate a mask generator for which the fixed-point iterations converge to a critical point of an explicit energy functional. In our experiments, we match the performance of state-of-the-art learned variational models for the solution of inverse problems. Additionally, we offer a promising balance between interpretability, theoretical guarantees, reliability, and performance.

CVMar 28
Weakly Convex Ridge Regularization for 3D Non-Cartesian MRI Reconstruction

German Shâma Wache, Chaithya G R, Asma Tanabene et al.

While highly accelerated non-Cartesian acquisition protocols significantly reduce scan time, they often entail long reconstruction delays. Deep learning based reconstruction methods can alleviate this, but often lack stability and robustness to distribution shifts. As an alternative, we train a rotation invariant weakly convex ridge regularizer (WCRR). The resulting variational reconstruction approach is benchmarked against state of the art methods on retrospectively simulated data and (out of distribution) on prospective GoLF SPARKLING and CAIPIRINHA acquisitions. Our approach consistently outperforms widely used baselines and achieves performance comparable to Plug and Play reconstruction with a state of the art 3D DRUNet denoiser, while offering substantially improved computational efficiency and robustness to acquisition changes. In summary, WCRR unifies the strengths of principled variational methods and modern deep learning based approaches.

MLFeb 7, 2024
Wasserstein Gradient Flows for Moreau Envelopes of f-Divergences in Reproducing Kernel Hilbert Spaces

Viktor Stein, Sebastian Neumayer, Nicolaj Rux et al.

Commonly used $f$-divergences of measures, e.g., the Kullback-Leibler divergence, are subject to limitations regarding the support of the involved measures. A remedy is regularizing the $f$-divergence by a squared maximum mean discrepancy (MMD) associated with a characteristic kernel $K$. We use the kernel mean embedding to show that this regularization can be rewritten as the Moreau envelope of some function on the associated reproducing kernel Hilbert space. Then, we exploit well-known results on Moreau envelopes in Hilbert spaces to analyze the MMD-regularized $f$-divergences, particularly their gradients. Subsequently, we use our findings to analyze Wasserstein gradient flows of MMD-regularized $f$-divergences. We provide proof-of-the-concept numerical examples for flows starting from empirical measures. Here, we cover $f$-divergences with infinite and finite recession constants. Lastly, we extend our results to the tight variational formulation of $f$-divergences and numerically compare the resulting flows.

IVFeb 6, 2025
DEALing with Image Reconstruction: Deep Attentive Least Squares

Mehrsa Pourya, Erich Kobler, Michael Unser et al.

State-of-the-art image reconstruction often relies on complex, highly parameterized deep architectures. We propose an alternative: a data-driven reconstruction method inspired by the classic Tikhonov regularization. Our approach iteratively refines intermediate reconstructions by solving a sequence of quadratic problems. These updates have two key components: (i) learned filters to extract salient image features, and (ii) an attention mechanism that locally adjusts the penalty of filter responses. Our method achieves performance on par with leading plug-and-play and learned regularizer approaches while offering interpretability, robustness, and convergent behavior. In effect, we bridge traditional regularization and deep learning with a principled reconstruction approach.

LGOct 2, 2025
Learning Regularization Functionals for Inverse Problems: A Comparative Study

Johannes Hertrich, Hok Shing Wong, Alexander Denker et al.

In recent years, a variety of learned regularization frameworks for solving inverse problems in imaging have emerged. These offer flexible modeling together with mathematical insights. The proposed methods differ in their architectural design and training strategies, making direct comparison challenging due to non-modular implementations. We address this gap by collecting and unifying the available code into a common framework. This unified view allows us to systematically compare the approaches and highlight their strengths and limitations, providing valuable insights into their future potential. We also provide concise descriptions of each method, complemented by practical guidelines.

IVDec 17, 2024
Learning of Patch-Based Smooth-Plus-Sparse Models for Image Reconstruction

Stanislas Ducotterd, Sebastian Neumayer, Michael Unser

We aim at the solution of inverse problems in imaging, by combining a penalized sparse representation of image patches with an unconstrained smooth one. This allows for a straightforward interpretation of the reconstruction. We formulate the optimization as a bilevel problem. The inner problem deploys classical algorithms while the outer problem optimizes the dictionary and the regularizer parameters through supervised learning. The process is carried out via implicit differentiation and gradient-based optimization. We evaluate our method for denoising, super-resolution, and compressed-sensing magnetic-resonance imaging. We compare it to other classical models as well as deep-learning-based methods and show that it always outperforms the former and also the latter in some instances.

LGNov 11, 2024
Generative Feature Training of Thin 2-Layer Networks

Johannes Hertrich, Sebastian Neumayer

We consider the approximation of functions by 2-layer neural networks with a small number of hidden weights based on the squared loss and small datasets. Due to the highly non-convex energy landscape, gradient-based training often suffers from local minima. As a remedy, we initialize the hidden weights with samples from a learned proposal distribution, which we parameterize as a deep generative model. To train this model, we exploit the fact that with fixed hidden weights, the optimal output weights solve a linear equation. After learning the generative model, we refine the sampled weights with a gradient-based post-processing in the latent space. Here, we also include a regularization scheme to counteract potential noise. Finally, we demonstrate the effectiveness of our approach by numerical examples.

NAApr 1
A remark on an error analysis for classical and learned Tikhonov regularization schemes

Arne Behrens, Meira Iske, Ming Jiang et al.

This paper presents an error analysis of classical and learned Tikhonov regularization schemes for inverse problems. We first demonstrate, both theoretically and numerically, that using a fixed regularization parameter across varying noise levels-which is a common miss-specification in practice-has only a mild impact on the reconstruction error. As a special case, we then investigate scenarios where the true data resides in an unknown finite-dimensional subspace. Here, our results lead to an empirically supported strategy for estimating the unknown dimension based on numerical experiments. Finally, we examine the approach that motivated this study: a method where a sparsity-promoting term is learned from denoising tasks and subsequently applied to general inverse problems via a simple heuristic parameter selection. The corresponding error analysis is initially developed using classical concepts and subsequently refined through a more detailed investigation of the discretized setting.

OCJun 18, 2024
Stability of Data-Dependent Ridge-Regularization for Inverse Problems

Sebastian Neumayer, Fabian Altekrüger

Theoretical guarantees for the robust solution of inverse problems have important implications for applications. To achieve both guarantees and high reconstruction quality, we propose learning a pixel-based ridge regularizer with a data-dependent and spatially varying regularization strength. For this architecture, we establish the existence of solutions to the associated variational problem and the stability of its solution operator. Further, we prove that the reconstruction forms a maximum-a-posteriori approach. Simulations for biomedical imaging and material sciences demonstrate that the approach yields high-quality reconstructions even if only a small instance-specific training set is available.

OCNov 4, 2020
Convolutional Proximal Neural Networks and Plug-and-Play Algorithms

Johannes Hertrich, Sebastian Neumayer, Gabriele Steidl

In this paper, we introduce convolutional proximal neural networks (cPNNs), which are by construction averaged operators. For filters of full length, we propose a stochastic gradient descent algorithm on a submanifold of the Stiefel manifold to train cPNNs. In case of filters with limited length, we design algorithms for minimizing functionals that approximate the orthogonality constraints imposed on the operators by penalizing the least squares distance to the identity operator. Then, we investigate how scaled cPNNs with a prescribed Lipschitz constant can be used for denoising signals and images, where the achieved quality depends on the Lipschitz constant. Finally, we apply cPNN based denoisers within a Plug-and-Play (PnP) framework and provide convergence results for the corresponding PnP forward-backward splitting algorithm based on an oracle construction.

LGSep 7, 2020
Stabilizing Invertible Neural Networks Using Mixture Models

Paul Hagemann, Sebastian Neumayer

In this paper, we analyze the properties of invertible neural networks, which provide a way of solving inverse problems. Our main focus lies on investigating and controlling the Lipschitz constants of the corresponding inverse networks. Without such an control, numerical simulations are prone to errors and not much is gained against traditional approaches. Fortunately, our analysis indicates that changing the latent distribution from a standard normal one to a Gaussian mixture model resolves the issue of exploding Lipschitz constants. Indeed, numerical simulations confirm that this modification leads to significantly improved sampling quality in multimodal applications.