MEMay 26
Bayesian optimal experimental design with Wasserstein information criteriaTapio Helin, Youssef Marzouk, Jose Rodrigo Rojo-Garcia
Bayesian optimal experimental design (OED) provides a principled framework for selecting observations or experiments. We introduce new Bayesian design criteria based on the expected Wasserstein-$p$ distance between the prior and posterior distributions, termed Wasserstein information criteria. These criteria have many parallels with the widely used expected information gain (EIG) criterion, which instead relies on the Kullback--Leibler divergence. We show that the Wasserstein-$2$ criterion admits a closed-form solution in the linear-Gaussian setting, a property which can be used for more general approximation schemes, and contrast this solution with classical notions of Bayesian alphabetic optimality. Then we develop a stability analysis of the Wasserstein-$1$ criterion, wherein we bound errors induced by perturbations of the prior or likelihood. We partially extend this analysis to the Wasserstein-$2$ criterion. In particular, these results yield error rates for empirical approximations of the prior. We then illustrate the computability of the Wasserstein-$2$ criterion and demonstrate our approximation rates through simulations.
NAMay 17, 2017
A Variational Reconstruction Method for Undersampled Dynamic X-ray Tomography based on Physical Motion ModelsMartin Burger, Hendrik Dirks, Lena Frerking et al.
In this paper we study the reconstruction of moving object densities from undersampled dynamic X-ray tomography in two dimensions. A particular motivation of this study is to use realistic measurement protocols for practical applications, i.e. we do not assume to have a full Radon transform in each time step, but only projections in few angular directions. This restriction enforces a space-time reconstruction, which we perform by incorporating physical motion models and regularization of motion vectors in a variational framework. The methodology of optical flow, which is one of the most common methods to estimate motion between two images, is utilized to formulate a joint variational model for reconstruction and motion estimation. We provide a basic mathematical analysis of the forward model and the variational model for the image reconstruction. Moreover, we discuss the efficient numerical minimization based on alternating minimizations between images and motion vectors. A variety of results are presented for simulated and real measurement data with different sampling strategy. A key observation is that random sampling combined with our model allows reconstructions of similar amount of measurements and quality as a single static reconstruction.
NAFeb 8, 2018
Large Noise in Variational RegularizationMartin Burger, Tapio Helin, Hanne Kekkonen
In this paper we consider variational regularization methods for inverse problems with large noise that is in general unbounded in the image space of the forward operator. We introduce a Banach space setting that allows to define a reasonable notion of solutions for more general noise in a larger space provided one has sufficient mapping properties of the forward operators. A key observation, which guides us through the subsequent analysis, is that such a general noise model can be understood with the same setting as approximate source conditions (while a standard model of bounded noise is related directly to classical source conditions). Based on this insight we obtain a quite general existence result for regularized variational problems and derive error estimates in terms of Bregman distances. The latter are specialized for the particularly important cases of one- and p-homogeneous regularization functionals. As a natural further step we study stochastic noise models and in particular white noise, for which we derive error estimates in terms of the expectation of the Bregman distance. The finiteness of certain expectations leads to a novel class of abstract smoothness conditions on the forward operator, which can be easily interpreted in the Hilbert space case. We finally exemplify the approach and in particular the conditions for popular examples of regularization functionals given by squared norm, Besov norm and total variation, respectively.
IMFeb 13, 2018
Atmospheric turbulence profiling with unknown power spectral densityTapio Helin, Stefan Kindermann, Jonatan Lehtonen et al.
Adaptive optics (AO) is a technology in modern ground-based optical telescopes to compensate the wavefront distortions caused by atmospheric turbulence. One method that allows to retrieve information about the atmosphere from telescope data is so-called SLODAR, where the atmospheric turbulence profile is estimated based on correlation data of Shack--Hartmann wavefront measurements. This approach relies on a layered Kolmogorov turbulence model. In this article, we propose a novel extension of the SLODAR concept by including a general non-Kolmogorov turbulence layer close to the ground with an unknown power spectral density. We prove that the joint estimation problem of the turbulence profile above ground simultaneously with the unknown power spectral density at the ground is ill-posed and propose three numerical reconstruction methods. We demonstrate by numerical simulations that our methods lead to substantial improvements in the turbulence profile reconstruction, compared to standard SLODAR-type approach. Also, our methods can accurately locate local perturbations in non-Kolmogorov power spectral densities.
NAFeb 15, 2013
Wavelet methods in multi-conjugate adaptive opticsTapio Helin, Mykhaylo Yudytskiy
The next generation ground-based telescopes rely heavily on adaptive optics for overcoming the limitation of atmospheric turbulence. In the future adaptive optics modalities, like multi-conjugate adaptive optics (MCAO), atmospheric tomography is the major mathematical and computational challenge. In this severely ill-posed problem a fast and stable reconstruction algorithm is needed that can take into account many real-life phenomena of telescope imaging. We introduce a novel reconstruction method for the atmospheric tomography problem and demonstrate its performance and flexibility in the context of MCAO. Our method is based on using locality properties of compactly supported wavelets, both in the spatial and frequency domain. The reconstruction in the atmospheric tomography problem is obtained by solving the Bayesian MAP estimator with a conjugate gradient based algorithm. An accelerated algorithm with preconditioning is also introduced. Numerical performance is demonstrated on the official end-to-end simulation tool OCTOPUS of European Southern Observatory.
NANov 23, 2015
Towards analytical model optimization in atmospheric tomographyTapio Helin, Stefan Kindermann, Daniela Saxenhuber
Modern ground-based telescopes rely on a technology called adaptive optics (AO) in order to compensate for the loss of image quality caused by atmospheric turbulence. Next-generation AO systems designed for a wide field of view require a stable and high-resolution reconstruction of the refractive index fluctuations in the atmosphere. By introducing a novel Bayesian method, we address the problem of estimating an atmospheric turbulence strength profile and reconstructing the refractive index fluctuations simultaneously, where we only use wavefront measurements of incoming light from guide stars. Most importantly, we demonstrate how this method can be used for model optimization as well. We propose two different algorithms for solving the maximum a posteriori estimate: the first approach is based on alternating minimization and has the advantage of integrability into existing atmospheric tomography methods. In the second approach, we formulate a convex non-differentiable optimization problem, which is solved by an iterative thresholding method. This approach clearly illustrates the underlying sparsity-enforcing mechanism for the strength profile. By introducing a tuning/regularization parameter, an automated model reduction of the layer structure of the atmosphere is achieved. Using numerical simulations, we demonstrate the performance of our method in practice.
MLFeb 3
Score-based diffusion models for diffuse optical tomography with uncertainty quantificationFabian Schneider, Meghdoot Mozumder, Konstantin Tamarov et al.
Score-based diffusion models are a recently developed framework for posterior sampling in Bayesian inverse problems with a state-of-the-art performance for severely ill-posed problems by leveraging a powerful prior distribution learned from empirical data. Despite generating significant interest especially in the machine-learning community, a thorough study of realistic inverse problems in the presence of modelling error and utilization of physical measurement data is still outstanding. In this work, the framework of unconditional representation for the conditional score function (UCoS) is evaluated for linearized difference imaging in diffuse optical tomography (DOT). DOT uses boundary measurements of near-infrared light to estimate the spatial distribution of absorption and scattering parameters in biological tissues. The problem is highly ill-posed and thus sensitive to noise and modelling errors. We introduce a novel regularization approach that prevents overfitting of the score function by constructing a mixed score composed of a learned and a model-based component. Validation of this approach is done using both simulated and experimental measurement data. The experiments demonstrate that a data-driven prior distribution results in posterior samples with low variance, compared to classical model-based estimation, and centred around the ground truth, even in the context of a highly ill-posed problem and in the presence of modelling errors.
IMDec 30, 2023Code
Laboratory Experiments of Model-based Reinforcement Learning for Adaptive Optics ControlJalo Nousiainen, Byron Engler, Markus Kasper et al.
Direct imaging of Earth-like exoplanets is one of the most prominent scientific drivers of the next generation of ground-based telescopes. Typically, Earth-like exoplanets are located at small angular separations from their host stars, making their detection difficult. Consequently, the adaptive optics (AO) system's control algorithm must be carefully designed to distinguish the exoplanet from the residual light produced by the host star. A new promising avenue of research to improve AO control builds on data-driven control methods such as Reinforcement Learning (RL). RL is an active branch of the machine learning research field, where control of a system is learned through interaction with the environment. Thus, RL can be seen as an automated approach to AO control, where its usage is entirely a turnkey operation. In particular, model-based reinforcement learning (MBRL) has been shown to cope with both temporal and misregistration errors. Similarly, it has been demonstrated to adapt to non-linear wavefront sensing while being efficient in training and execution. In this work, we implement and adapt an RL method called Policy Optimization for AO (PO4AO) to the GHOST test bench at ESO headquarters, where we demonstrate a strong performance of the method in a laboratory environment. Our implementation allows the training to be performed parallel to inference, which is crucial for on-sky operation. In particular, we study the predictive and self-calibrating aspects of the method. The new implementation on GHOST running PyTorch introduces only around 700 microseconds in addition to hardware, pipeline, and Python interface latency. We open-source well-documented code for the implementation and specify the requirements for the RTC pipeline. We also discuss the important hyperparameters of the method, the source of the latency, and the possible paths for a lower latency implementation.
NAApr 29
Robust Model-Based Iteration for Passive Gamma Emission TomographyTommi Heikkilä, Sara Heikkinen, Riina Rimppi et al.
Passive Gamma Emission Tomography (PGET) is an IAEA-approved technique for verifying spent nuclear fuel assemblies prior to geological disposal. Reconstructing the emission and attenuation maps from PGET measurements is a nonlinear ill-posed inverse problem, currently solved with a Levenberg-Marquardt (LM) scheme that requires 10-20 iterations to achieve sufficient accuracy. We propose an accelerated iterative solver that combines the LM algorithm with a Deep Gauss-Newton step, in which a learned operator refines the update proposed by the deterministic algorithm at each iteration. A safeguard condition based on the trust-region model ensures that the accelerated iterates perform no worse than LM and retain convergence to a critical point of the regularized objective. Within this framework we compare three architectures for the learned component: an encoder-decoder-style convolutional neural network, Fourier Neural Operators, and Wavelet Neural Operators. Each is trained on a small set of coarsely simulated 9x9 assemblies. Experiments on simulated and real measurements from Finnish nuclear power plants show that the proposed scheme reaches LM-quality reconstructions in roughly one third of the iterations, while revealing architecture-dependent trade-offs in robustness against out-of-distribution inputs.
MLOct 1, 2025
Approximation of differential entropy in Bayesian optimal experimental designChuntao Chen, Tapio Helin, Nuutti Hyvönen et al.
Bayesian optimal experimental design provides a principled framework for selecting experimental settings that maximize obtained information. In this work, we focus on estimating the expected information gain in the setting where the differential entropy of the likelihood is either independent of the design or can be evaluated explicitly. This reduces the problem to maximum entropy estimation, alleviating several challenges inherent in expected information gain computation. Our study is motivated by large-scale inference problems, such as inverse problems, where the computational cost is dominated by expensive likelihood evaluations. We propose a computational approach in which the evidence density is approximated by a Monte Carlo or quasi-Monte Carlo surrogate, while the differential entropy is evaluated using standard methods without additional likelihood evaluations. We prove that this strategy achieves convergence rates that are comparable to, or better than, state-of-the-art methods for full expected information gain estimation, particularly when the cost of entropy evaluation is negligible. Moreover, our approach relies only on mild smoothness of the forward map and avoids stronger technical assumptions required in earlier work. We also present numerical experiments, which confirm our theoretical findings.
MLDec 21, 2024
Gradient-Based Non-Linear Inverse LearningAbhishake, Nicole Mücke, Tapio Helin
We study statistical inverse learning in the context of nonlinear inverse problems under random design. Specifically, we address a class of nonlinear problems by employing gradient descent (GD) and stochastic gradient descent (SGD) with mini-batching, both using constant step sizes. Our analysis derives convergence rates for both algorithms under classical a priori assumptions on the smoothness of the target function. These assumptions are expressed in terms of the integral operator associated with the tangent kernel, as well as through a bound on the effective dimension. Additionally, we establish stopping times that yield minimax-optimal convergence rates within the classical reproducing kernel Hilbert space (RKHS) framework. These results demonstrate the efficacy of GD and SGD in achieving optimal rates for nonlinear inverse problems in random design.
MLDec 20, 2024
Learning sparsity-promoting regularizers for linear inverse problemsGiovanni S. Alberti, Ernesto De Vito, Tapio Helin et al.
This paper introduces a novel approach to learning sparsity-promoting regularizers for solving linear inverse problems. We develop a bilevel optimization framework to select an optimal synthesis operator, denoted as $B$, which regularizes the inverse problem while promoting sparsity in the solution. The method leverages statistical properties of the underlying data and incorporates prior knowledge through the choice of $B$. We establish the well-posedness of the optimization problem, provide theoretical guarantees for the learning process, and present sample complexity bounds. The approach is demonstrated through examples, including compact perturbations of a known operator and the problem of learning the mother wavelet, showcasing its flexibility in incorporating prior knowledge into the regularization framework. This work extends previous efforts in Tikhonov regularization by addressing non-differentiable norms and proposing a data-driven approach for sparse regularization in infinite dimensions.
MLMay 24, 2024
An Unconditional Representation of the Conditional Score in Infinite-Dimensional Linear Inverse ProblemsFabian Schneider, Duc-Lam Duong, Matti Lassas et al.
Score-based diffusion models (SDMs) have emerged as a powerful tool for sampling from the posterior distribution in Bayesian inverse problems. However, existing methods often require multiple evaluations of the forward mapping to generate a single sample, resulting in significant computational costs for large-scale inverse problems. To address this, we propose an unconditional representation of the conditional score function (UCoS) tailored to linear inverse problems, which avoids forward model evaluations during sampling by shifting computational effort to an offline training phase. In this phase, a \emph{task-dependent} score function is learned based on the linear forward operator. Crucially, we show that the conditional score can be derived \emph{exactly} from a trained (unconditional) score using affine transformations, eliminating the need for conditional score approximations. Our approach is formulated in infinite-dimensional function spaces, making it inherently discretization-invariant. We support this formulation with a rigorous convergence analysis that justifies UCoS beyond any specific discretization. Finally we validate UCoS through high-dimensional computed tomography (CT) and image deblurring experiments, demonstrating both scalability and accuracy.
MLFeb 18, 2021
Convex regularization in statistical inverse learning problemsTatiana A. Bubba, Martin Burger, Tapio Helin et al.
We consider a statistical inverse learning problem, where the task is to estimate a function $f$ based on noisy point evaluations of $Af$, where $A$ is a linear operator. The function $Af$ is evaluated at i.i.d. random design points $u_n$, $n=1,...,N$ generated by an unknown general probability distribution. We consider Tikhonov regularization with general convex and $p$-homogeneous penalty functionals and derive concentration rates of the regularized solution to the ground truth measured in the symmetric Bregman distance induced by the penalty functional. We derive concrete rates for Besov norm penalties and numerically demonstrate the correspondence with the observed rates in the context of X-ray tomography.
STMay 10, 2019
Hyperparameter Estimation in Bayesian MAP Estimation: Parameterizations and ConsistencyMatthew M. Dunlop, Tapio Helin, Andrew M. Stuart
The Bayesian formulation of inverse problems is attractive for three primary reasons: it provides a clear modelling framework; means for uncertainty quantification; and it allows for principled learning of hyperparameters. The posterior distribution may be explored by sampling methods, but for many problems it is computationally infeasible to do so. In this situation maximum a posteriori (MAP) estimators are often sought. Whilst these are relatively cheap to compute, and have an attractive variational formulation, a key drawback is their lack of invariance under change of parameterization. This is a particularly significant issue when hierarchical priors are employed to learn hyperparameters. In this paper we study the effect of the choice of parameterization on MAP estimators when a conditionally Gaussian hierarchical prior distribution is employed. Specifically we consider the centred parameterization, the natural parameterization in which the unknown state is solved for directly, and the noncentred parameterization, which works with a whitened Gaussian as the unknown state variable, and arises when considering dimension-robust MCMC algorithms; MAP estimation is well-defined in the nonparametric setting only for the noncentred parameterization. However, we show that MAP estimates based on the noncentred parameterization are not consistent as estimators of hyperparameters; conversely, we show that limits of finite-dimensional centred MAP estimators are consistent as the dimension tends to infinity. We also consider empirical Bayesian hyperparameter estimation, show consistency of these estimates, and demonstrate that they are more robust with respect to noise than centred MAP estimates. An underpinning concept throughout is that hyperparameters may only be recovered up to measure equivalence, a well-known phenomenon in the context of the Ornstein-Uhlenbeck process.