35.4 · NA · May 22
Ergodicity of Langevin Dynamics and its Discretizations for Non-smooth Potentials
Lorenz Fruehwirth, Andreas Habring
This article is concerned with sampling from Gibbs distributions $π(x)\propto e^{-U(x)}$ using Markov chain Monte Carlo methods. In particular, we investigate Langevin dynamics in the continuous- and the discrete-time setting for such distributions with potentials $U(x)$ which are strongly convex but possibly non-differentiable. We show that the corresponding subgradient Langevin dynamics are exponentially ergodic to the target density $π$ in the continuous setting and that certain explicit as well as semi-implicit discretizations are geometrically ergodic and approximate $π$ for vanishing discretization step size. Moreover, we prove that the discrete schemes satisfy the law of large numbers, which allows consecutive iterates of a Markov chain to be used to compute statistics of the stationary distribution and significantly reduces computational complexity in practice. Numerical experiments confirm the theoretical findings and showcase the practical relevance of the proposed methods in imaging applications.
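As a rough illustration of an explicit subgradient Langevin scheme, the sketch below samples from $π(x)\propto e^{-U(x)}$ with the toy potential $U(x)=\tfrac12\|x\|^2+λ\|x\|_1$, which is strongly convex and non-differentiable. The potential, step size, and function names are assumptions chosen for illustration and are not taken from the paper. Statistics are computed from consecutive iterates of one chain, in the spirit of the law of large numbers mentioned in the abstract.

```python
import numpy as np

def subgradient(x, lam=1.0):
    # One element of the subdifferential of U(x) = 0.5*||x||^2 + lam*||x||_1.
    # np.sign returns 0 at x == 0, which is a valid subgradient of |.| there.
    return x + lam * np.sign(x)

def subgradient_langevin(n_steps, tau, dim=2, lam=1.0, seed=0):
    # Explicit scheme: x_{k+1} = x_k - tau * g(x_k) + sqrt(2 * tau) * xi_k,
    # with g(x_k) a subgradient of U and xi_k standard Gaussian noise.
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    running_sum = np.zeros(dim)
    running_sq = np.zeros(dim)
    for _ in range(n_steps):
        noise = rng.standard_normal(dim)
        x = x - tau * subgradient(x, lam) + np.sqrt(2.0 * tau) * noise
        # Accumulate ergodic averages over consecutive iterates of one chain.
        running_sum += x
        running_sq += x**2
    mean = running_sum / n_steps
    var = running_sq / n_steps - mean**2
    return mean, var

# Hypothetical settings; no burn-in is discarded in this sketch.
mean, var = subgradient_langevin(n_steps=100_000, tau=0.02)
print("ergodic mean:", mean)
print("ergodic variance:", var)
```

Because the target is symmetric, the ergodic mean should be close to zero in each coordinate. The step size introduces a small bias that shrinks as `tau` decreases.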
CV · Dec 23, 2022
Posterior-Variance-Based Error Quantification for Inverse Problems in Imaging
Dominik Narnhofer, Andreas Habring, Martin Holler et al.
In this work, a method for obtaining pixel-wise error bounds in Bayesian regularization of inverse imaging problems is introduced. The proposed method employs estimates of the posterior variance together with techniques from conformal prediction in order to obtain coverage guarantees for the error bounds, without making any assumption on the underlying data distribution. It is generally applicable to Bayesian regularization approaches, independent, e.g., of the concrete choice of the prior. Furthermore, the coverage guarantees can also be obtained in case only approximate sampling from the posterior is possible. In particular, the proposed framework can therefore incorporate any learned prior in a black-box manner. Guaranteed coverage without assumptions on the underlying distributions is only achievable since the magnitude of the error bounds is, in general, unknown in advance. Nevertheless, experiments with multiple regularization approaches presented in the paper confirm that in practice, the obtained error bounds are rather tight. To carry out the numerical experiments, a novel primal-dual Langevin algorithm for sampling from non-smooth distributions is also introduced in this work.
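The general idea of combining posterior-variance estimates with conformal prediction can be sketched with standard split conformal calibration on variance-normalized residuals. This is a generic textbook construction on synthetic scalar data and is not necessarily the exact procedure of the paper. The function name `conformal_scale` and the deliberately miscalibrated synthetic "posterior" are assumptions made for illustration.

```python
import numpy as np

def conformal_scale(truth, post_mean, post_std, alpha=0.1, eps=1e-12):
    # Nonconformity score: absolute error normalized by the posterior std estimate.
    scores = np.abs(truth - post_mean) / (post_std + eps)
    n = scores.size
    # Finite-sample corrected quantile level used in split conformal prediction.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

# Synthetic data (assumption): the posterior std underestimates the true error
# by a factor of 3, so the raw posterior variance alone would under-cover.
rng = np.random.default_rng(0)
n_cal, n_test = 5000, 5000
n = n_cal + n_test
post_std = rng.uniform(0.5, 2.0, size=n)
post_mean = rng.standard_normal(n)
truth = post_mean + 3.0 * post_std * rng.standard_normal(n)

# Calibrate the scaling factor on held-out data with known ground truth.
q = conformal_scale(truth[:n_cal], post_mean[:n_cal], post_std[:n_cal], alpha=0.1)

# Error bound for new data: |truth - post_mean| <= q * post_std.
covered = np.abs(truth[n_cal:] - post_mean[n_cal:]) <= q * post_std[n_cal:]
coverage = covered.mean()
print("calibrated scale:", q, "empirical coverage:", coverage)
```

The calibration step needs no distributional assumption beyond exchangeability of calibration and test data, which is why a poorly calibrated or only approximately sampled posterior can still yield valid coverage. The price is that the size of the bound `q * post_std` is not known in advance.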
69.6 · NA · May 11
Forward-KL Convergence of Time-Inhomogeneous Langevin Diffusions
Andreas Habring, Martin Zach
Many practical samplers rely on time-dependent drifts, often induced by annealing or tempering schedules, to improve exploration and stability. This motivates a unified non-asymptotic analysis of the corresponding Langevin diffusions and their discretizations. We provide a convergence analysis that includes non-asymptotic bounds for the continuous-time diffusion and its Euler--Maruyama discretization in the forward-Kullback--Leibler divergence under a single set of abstract conditions on the time-dependent drift. The results apply to many practically relevant annealing schemes, including geometric tempering and annealed Langevin sampling. In addition, we provide numerical experiments comparing the annealing schemes covered by our theory in low- and high-dimensional settings.
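To make the notion of a time-dependent drift concrete, the following sketch applies an Euler--Maruyama discretization to a geometric tempering path $π_λ \propto π_{\mathrm{ref}}^{1-λ}\,π^{λ}$, whose drift is $-(1-λ)\nabla U_{\mathrm{ref}} - λ\nabla U$. The Gaussian reference and target, the piecewise-linear schedule, and all names are assumptions chosen for illustration and do not reproduce the paper's experiments.

```python
import numpy as np

def grad_u_ref(x):
    # Reference potential U_ref(x) = x^2 / 2, i.e. a standard normal.
    return x

def grad_u_target(x, mu=3.0, sigma=0.5):
    # Target potential U(x) = (x - mu)^2 / (2 sigma^2), i.e. N(mu, sigma^2).
    return (x - mu) / sigma**2

def annealed_euler_maruyama(n_steps=2000, tau=0.01, n_particles=2000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)  # start from the reference density
    for k in range(n_steps):
        # Schedule (assumption): lambda rises linearly to 1 over the first half
        # of the run and then stays at 1, so the chain settles on the target.
        lam = min(1.0, 2.0 * (k + 1) / n_steps)
        drift = -((1.0 - lam) * grad_u_ref(x) + lam * grad_u_target(x))
        x = x + tau * drift + np.sqrt(2.0 * tau) * rng.standard_normal(n_particles)
    return x

samples = annealed_euler_maruyama()
print("sample mean:", samples.mean(), "sample std:", samples.std())
```

With these settings the particles should end up close to the target $N(3, 0.5^2)$, up to Monte Carlo error and the step-size bias of the discretization.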
CV · Apr 22, 2022
A Note on the Regularity of Images Generated by Convolutional Neural Networks
Andreas Habring, Martin Holler
The regularity of images generated by convolutional neural networks, such as the U-net, generative networks, or the deep image prior, is analyzed. In a resolution-independent, infinite dimensional setting, it is shown that such images, represented as functions, are always continuous and, in some circumstances, even continuously differentiable, contradicting the widely accepted modeling of sharp edges in images via jump discontinuities. While such statements require an infinite dimensional setting, the connection to (discretized) neural networks used in practice is made by considering the limit as the resolution approaches infinity. As a practical consequence, the results of this paper provide, in particular, analytical evidence that basic L2 regularization of network weights might lead to over-smoothed outputs.
OC · Jul 20, 2022
Unsupervised energy disaggregation via convolutional sparse coding
Christian Aarset, Andreas Habring, Martin Holler et al.
In this work, a method for unsupervised energy disaggregation in private households equipped with smart meters is proposed. This method aims to classify power consumption as active or passive, granting the ability to report on the residents' activity and presence without direct interaction. This lays the foundation for applications like non-intrusive health monitoring of private homes. The proposed method is based on minimizing a suitable energy functional using the iPALM (inertial proximal alternating linearized minimization) algorithm, and it is shown that various conditions guaranteeing convergence are satisfied. To confirm the feasibility of the proposed method, experiments on semi-synthetic test data sets and a comparison to existing, supervised methods are provided.
90.0 · LG · May 18
Generating Physically Consistent Molecules with Energy-Based Models
Christoph Griesbacher, Lea Bogensperger, Andreas Habring et al.
Molecules in equilibrium follow a Boltzmann distribution, making the underlying energy landscape a physically grounded modeling objective. However, such landscapes are difficult to learn from data and, once learned, hard to sample from. Diffusion and flow-matching models sidestep these difficulties by learning a time-conditional score or transport field between noise and data, losing the energy inductive bias in exchange for a more tractable training objective. We introduce EBMol, an energy-based model (EBM) that restores this inductive bias by learning an atom-additive scalar potential without explicit simulation during training. Our method employs a flow-inspired Restoring Field Matching objective to approximate the energy landscape. We adopt the Mirror-Langevin algorithm for sampling, enabling unified updates of atomic positions and types, and incorporate parallel tempering for inference-time compute scaling. EBMol is the first EBM for 3D molecular generation to achieve state-of-the-art performance on QM9 and GEOM-Drugs. Moreover, we show that the learned energy landscape serves as a principled quality metric for ranking and filtering configurations, and demonstrate controllable generation without retraining through shape-steered sampling via potential composition and zero-shot linker design.
56.8 · ST · May 7
Time-Inhomogeneous Preconditioned Langevin Dynamics
Alexander Falk, Laurenz Nagler, Andreas Habring et al.
Langevin sampling from distributions of the form $p(x) \propto \exp(-Ψ(x))$ faces two major challenges: (global) mode coverage and (local) mode exploration. The first challenge is particularly relevant for multi-modal distributions with disjoint modes, whereas the second arises when the potential $Ψ$ exhibits diverse and ill-conditioned local mode geometry. To address these challenges, a common approach is to precondition Langevin dynamics with problem-specific information, such as the sample covariance or the local curvature of $Ψ$. However, existing preconditioner choices inherently involve a trade-off between global mode coverage and local mode exploration, and no prior method resolves both simultaneously. To overcome this limitation, we propose TIPreL, which introduces a time- and position-dependent preconditioner. This design effectively addresses both challenges mentioned above within a single framework. We establish convergence of the resulting dynamics in the Wasserstein-2 distance both in continuous time and for a tamed Euler discretization. In particular, our analysis extends the existing state of the art by proving convergence under time- and space-dependent diffusion coefficients and only locally Lipschitz drifts, a setting not covered by prior work. Finally, we experimentally compare TIPreL with competing preconditioning schemes on a two-dimensional, severely ill-posed example and on a Bayesian logistic regression task in higher dimensions, confirming the efficiency of the proposed method.
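For orientation, the standard time-homogeneous form of Langevin dynamics with a position-dependent preconditioner can be written as follows. This is general background and not the TIPreL dynamics, whose time dependence is not specified in the abstract. The symbol $M(x)$, a symmetric positive definite matrix field, is notation assumed here.

$$
\mathrm{d}X_t = \Big[-M(X_t)\,\nabla Ψ(X_t) + \nabla\!\cdot\! M(X_t)\Big]\,\mathrm{d}t + \sqrt{2\,M(X_t)}\;\mathrm{d}W_t,
\qquad \big(\nabla\!\cdot\! M\big)_i = \sum_j \partial_{x_j} M_{ij}.
$$

Under suitable regularity conditions, the divergence correction $\nabla\!\cdot\! M$ is what keeps $p(x)\propto\exp(-Ψ(x))$ invariant when $M$ varies with position. For constant $M$ it vanishes and the dynamics reduce to ordinary preconditioned Langevin dynamics.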
IV · May 19, 2025
The Gaussian Latent Machine: Efficient Prior and Posterior Sampling for Inverse Problems
Muhamed Kuric, Martin Zach, Andreas Habring et al.
We consider the problem of sampling from a product-of-experts-type model that encompasses many standard prior and posterior distributions commonly found in Bayesian imaging. We show that this model can be easily lifted into a novel latent variable model, which we refer to as a Gaussian latent machine. This leads to a general sampling approach that unifies and generalizes many existing sampling algorithms in the literature. Most notably, it yields a highly efficient and effective two-block Gibbs sampling approach in the general case, while also specializing to direct sampling algorithms in particular cases. Finally, we present detailed numerical experiments that demonstrate the efficiency and effectiveness of our proposed sampling approach across a wide range of prior and posterior sampling problems from Bayesian imaging.
OC · Feb 3, 2025
Diffusion at Absolute Zero: Langevin Sampling Using Successive Moreau Envelopes [conference paper]
Andreas Habring, Alexander Falk, Thomas Pock
In this article, we propose a novel method for sampling from Gibbs distributions of the form $π(x)\propto\exp(-U(x))$ with a potential $U(x)$. In particular, inspired by diffusion models, we propose to consider a sequence $(π^{t_k})_k$ of approximations of the target density, for which $π^{t_k}\approx π$ for $k$ small and, on the other hand, $π^{t_k}$ exhibits favorable properties for sampling for $k$ large. This sequence is obtained by replacing parts of the potential $U$ by its Moreau envelopes. Sampling is performed in an annealed-Langevin-type procedure, that is, sequentially sampling from $π^{t_k}$ for decreasing $k$, effectively guiding the samples from a simple starting density to the more complex target. In addition to a theoretical analysis, we show experimental results supporting the efficacy of the method in terms of increased convergence speed and applicability to multi-modal densities $π$.
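The mechanism can be sketched on a toy potential $U(x)=\tfrac12 x^2+|x|$, where the non-smooth part $|x|$ is replaced by its Moreau envelope with parameter $t$, and $t$ is decreased level by level. The potential, the schedule `ts`, the step size, and the function names are assumptions chosen for illustration and are not the paper's experimental setup. The sketch relies on the standard identity $\nabla M_t f(x) = (x-\operatorname{prox}_{tf}(x))/t$, with soft-thresholding as the proximal map of $t|\cdot|$.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal map of t * |.|.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def grad_moreau_abs(x, t):
    # Gradient of the Moreau envelope of |.| with parameter t:
    # (x - prox_{t|.|}(x)) / t, which equals clip(x / t, -1, 1).
    return (x - soft_threshold(x, t)) / t

def annealed_langevin(ts, steps_per_level, tau, n_particles=2000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)  # simple starting density
    for t in ts:  # decreasing t: from strongly smoothed towards the target
        for _ in range(steps_per_level):
            # Smoothed potential: 0.5 * x^2 + (Moreau envelope of |x|)(x).
            drift = -(x + grad_moreau_abs(x, t))
            noise = rng.standard_normal(n_particles)
            x = x + tau * drift + np.sqrt(2.0 * tau) * noise
    return x

# Hypothetical schedule of envelope parameters.
ts = [1.0, 0.5, 0.2, 0.1, 0.05]
samples = annealed_langevin(ts, steps_per_level=300, tau=0.01)
print("sample mean:", samples.mean(), "sample variance:", samples.var())
```

The gradient of the envelope is Lipschitz with constant $1/t$, so larger $t$ tolerates larger step sizes, while smaller $t$ tracks the non-smooth target more closely. That trade-off is what the annealing schedule exploits.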