ITMay 20
On the reconstruction of bandlimited signals from random samples quantized via noise-shapingRohan Joy, Felix Krahmer, Alessandro Lupoli et al.
Noise-shaping quantization techniques are widely used for converting bandlimited signals from the analog to the digital domain. They work by ``shaping" the quantization noise so that it falls close to the reconstruction operator's null space. We investigate the compatibility of two such schemes, specifically $ΣΔ$ quantization and distributed noise-shaping quantization, with random samples of bandlimited functions. Suppose $R>1$ is a real number and assume that $\{x_i\}_{i=1}^m$ is a sequence of i.i.d random variables uniformly distributed on $[-\tilde{R},\tilde{R}]$, where $\tilde{R}>R$ is appropriately chosen. We show that by using a noise-shaping quantizer to quantize the (randomly sign flipped) values of a real-valued $π$-bandlimited function $f$ at $\{x_i\}_{i=1}^m$, a function $f^{\sharp}$ can be reconstructed from these quantized values such that $\|f-f^{\sharp}\|_{L^2[-R, R]}$ decays with high probability as $m$ and $\tilde{R}$ increase. This decay holds uniformly over all bandlimited $f$. We emphasize that the sample points $\{x_i\}_{i=1}^m$ are completely random, that is, they have no predefined structure, which makes our findings the first of their kind.
ITFeb 11, 2011
New and improved Johnson-Lindenstrauss embeddings via the Restricted Isometry PropertyFelix Krahmer, Rachel Ward
Consider an m by N matrix Phi with the Restricted Isometry Property of order k and level delta, that is, the norm of any k-sparse vector in R^N is preserved to within a multiplicative factor of 1 +- delta under application of Phi. We show that by randomizing the column signs of such a matrix Phi, the resulting map with high probability embeds any fixed set of p = O(e^k) points in R^N into R^m without distorting the norm of any point in the set by more than a factor of 1 +- delta. Consequently, matrices with the Restricted Isometry Property and with randomized column signs provide optimal Johnson-Lindenstrauss embeddings up to logarithmic factors in N. In particular, our results improve the best known bounds on the necessary embedding dimension m for a wide class of structured random matrices; for partial Fourier and partial Hadamard matrices, we improve the recent bound m = O(delta^(-4) log(p) log^4(N)) appearing in Ailon and Liberty to m = O(delta^(-2) log(p) log^4(N)), which is optimal up to the logarithmic factors in N. Our results also have a direct application in the area of compressed sensing for redundant dictionaries.
NAJun 28, 2011
Optimally Sparse FramesPeter G. Casazza, Andreas Heinecke, Felix Krahmer et al.
Frames have established themselves as a means to derive redundant, yet stable decompositions of a signal for analysis or transmission, while also promoting sparse expansions. However, when the signal dimension is large, the computation of the frame measurements of a signal typically requires a large number of additions and multiplications, and this makes a frame decomposition intractable in applications with limited computing budget. To address this problem, in this paper, we focus on frames in finite-dimensional Hilbert spaces and introduce sparsity for such frames as a new paradigm. In our terminology, a sparse frame is a frame whose elements have a sparse representation in an orthonormal basis, thereby enabling low-complexity frame decompositions. To introduce a precise meaning of optimality, we take the sum of the numbers of vectors needed of this orthonormal basis when expanding each frame vector as sparsity measure. We then analyze the recently introduced algorithm Spectral Tetris for construction of unit norm tight frames and prove that the tight frames generated by this algorithm are in fact optimally sparse with respect to the standard unit vector basis. Finally, we show that even the generalization of Spectral Tetris for the construction of unit norm frames associated with a given frame operator produces optimally sparse frames.
ITMay 5
Fast One-Pass Sparse Approximation of the Top Eigenvectors of Huge Approximately Low-Rank Matrices? Yes, $MAM^*$!Edem Boahen, Simone Brugiapaglia, Hung-Hsu Chou et al.
Motivated by applications such as sparse PCA, in this paper we present provably-accurate one-pass algorithms for the sparse approximation of the top eigenvectors of extremely massive matrices based on a single compact linear sketch. The resulting compressive-sensing-based approaches can approximate the leading eigenvectors of huge approximately low-rank matrices that are too large to store in memory based on a single pass over its entries while utilizing a total memory footprint on the order of the much smaller desired sparse eigenvector approximations. Finally, the compressive sensing recovery algorithm itself (which takes the gathered compressive matrix measurements as input, and then outputs sparse approximations of its top eigenvectors) can also be formulated to run in a time which principally depends on the size of the sought sparse approximations, making its runtime sublinear in the size of the large matrix whose eigenvectors one aims to approximate. Preliminary experiments on huge matrices having $\sim 10^{16}$ entries illustrate the developed theory and demonstrate the practical potential of the proposed approach.
SPJul 18, 2024
High-Dimensional Confidence Regions in Sparse MRIFrederik Hoppe, Felix Krahmer, Claudio Mayrink Verdun et al.
One of the most promising solutions for uncertainty quantification in high-dimensional statistics is the debiased LASSO that relies on unconstrained $\ell_1$-minimization. The initial works focused on real Gaussian designs as a toy model for this problem. However, in medical imaging applications, such as compressive sensing for MRI, the measurement system is represented by a (subsampled) complex Fourier matrix. The purpose of this work is to extend the method to the MRI case in order to construct confidence intervals for each pixel of an MR image. We show that a sufficient amount of data is $n \gtrsim \max\{ s_0\log^2 s_0\log p, s_0 \log^2 p \}$.
SINov 12, 2025Code
Conformal Prediction for Multi-Source Detection on a NetworkXingchao Jian, Purui Zhang, Lan Tian et al.
Detecting the origin of information or infection spread in networks is a fundamental challenge with applications in misinformation tracking, epidemiology, and beyond. We study the multi-source detection problem: given snapshot observations of node infection status on a graph, estimate the set of source nodes that initiated the propagation. Existing methods either lack statistical guarantees or are limited to specific diffusion models and assumptions. We propose a novel conformal prediction framework that provides statistically valid recall guarantees for source set detection, independent of the underlying diffusion process or data distribution. Our approach introduces principled score functions to quantify the alignment between predicted probabilities and true sources, and leverages a calibration set to construct prediction sets with user-specified recall and coverage levels. The method is applicable to both single- and multi-source scenarios, supports general network diffusion dynamics, and is computationally efficient for large graphs. Empirical results demonstrate that our method achieves rigorous coverage with competitive accuracy, outperforming existing baselines in both reliability and scalability.The code is available online.
MLSep 14, 2023
Uncertainty quantification for learned ISTAFrederik Hoppe, Claudio Mayrink Verdun, Felix Krahmer et al.
Model-based deep learning solutions to inverse problems have attracted increasing attention in recent years as they bridge state-of-the-art numerical performance with interpretability. In addition, the incorporated prior domain knowledge can make the training more efficient as the smaller number of parameters allows the training step to be executed with smaller datasets. Algorithm unrolling schemes stand out among these model-based learning techniques. Despite their rapid advancement and their close connection to traditional high-dimensional statistical methods, they lack certainty estimates and a theory for uncertainty quantification is still elusive. This work provides a step towards closing this gap proposing a rigorous way to obtain confidence intervals for the LISTA estimator.
LGAug 5, 2023
Approximating Positive Homogeneous Functions with Scale Invariant Neural NetworksStefan Bamberger, Reinhard Heckel, Felix Krahmer
We investigate to what extent it is possible to solve linear inverse problems with $ReLu$ networks. Due to the scaling invariance arising from the linearity, an optimal reconstruction function $f$ for such a problem is positive homogeneous, i.e., satisfies $f(λx) = λf(x)$ for all non-negative $λ$. In a $ReLu$ network, this condition translates to considering networks without bias terms. We first consider recovery of sparse vectors from few linear measurements. We prove that $ReLu$- networks with only one hidden layer cannot even recover $1$-sparse vectors, not even approximately, and regardless of the width of the network. However, with two hidden layers, approximate recovery with arbitrary precision and arbitrary sparsity level $s$ is possible in a stable way. We then extend our results to a wider class of recovery problems including low-rank matrix recovery and phase retrieval. Furthermore, we also consider the approximation of general positive homogeneous functions with neural networks. Extending previous work, we establish new results explaining under which conditions such functions can be approximated with neural networks. Our results also shed some light on the seeming contradiction between previous works showing that neural networks for inverse problems typically have very large Lipschitz constants, but still perform very well also for adversarial noise. Namely, the error bounds in our expressivity results include a combination of a small constant term and a term that is linear in the noise level, indicating that robustness issues may occur only for very small noise levels.
ITMar 17, 2023
How robust is randomized blind deconvolution via nuclear norm minimization against adversarial noise?Julia Kostin, Felix Krahmer, Dominik Stöger
In this paper, we study the problem of recovering two unknown signals from their convolution, which is commonly referred to as blind deconvolution. Reformulation of blind deconvolution as a low-rank recovery problem has led to multiple theoretical recovery guarantees in the past decade due to the success of the nuclear norm minimization heuristic. In particular, in the absence of noise, exact recovery has been established for sufficiently incoherent signals contained in lower-dimensional subspaces. However, if the convolution is corrupted by additive bounded noise, the stability of the recovery problem remains much less understood. In particular, existing reconstruction bounds involve large dimension factors and therefore fail to explain the empirical evidence for dimension-independent robustness of nuclear norm minimization. Recently, theoretical evidence has emerged for ill-posed behavior of low-rank matrix recovery for sufficiently small noise levels. In this work, we develop improved recovery guarantees for blind deconvolution with adversarial noise which exhibit square-root scaling in the noise level. Hence, our results are consistent with existing counterexamples which speak against linear scaling in the noise level as demonstrated for related low-rank matrix recovery problems.
LGMay 21
Plug-in Losses for Evidential Deep Learning: A Simplified Framework for Uncertainty Estimation that Includes the Softmax ClassifierBerk Hayta, Hannah Laus, Simon Mittermaier et al.
Real-world sensor-based learning systems require uncertainty estimation that is both reliable and computationally efficient. Evidential Deep Learning (EDL) provides single-pass uncertainty estimation by modeling the class probabilities via Dirichlet distributions, where the Dirichlet parameters are predicted by a learned neural network mapping. However, this approach can lead to computational challenges, as Dirichlet expected objectives are more complex than standard supervised learning losses, complicating their analysis and implementation. We address this issue by approximating the objective of the first-order empirical risk minimization problem induced by EDL with a plug-in loss evaluated at the Dirichlet mean and show that, under mild assumptions, the approximation error decays with growing evidence for a broad class of loss functions, including mean-squared error and cross-entropy loss. As a special case, our analysis provides justification for the use of softmax in the context of uncertainty estimation, since under a particular evidence-to-Dirichlet mapping, our framework includes the standard softmax classifier. We validate the proposed simplified objectives on the Google Speech Commands dataset and show that they achieve predictive accuracy and selective prediction performance comparable to classical EDL, while being simpler to implement using standard deep learning losses and training pipelines. To the best of our knowledge, this empirical analysis is the first to obtain coverage-accuracy trade-offs for speech recognition tasks through EDL.
LGDec 30, 2022
Non-intrusive surrogate modelling using sparse random features with applications in crashworthiness analysisMaternus Herold, Anna Veselovska, Jonas Jehle et al.
Efficient surrogate modelling is a key requirement for uncertainty quantification in data-driven scenarios. In this work, a novel approach of using Sparse Random Features for surrogate modelling in combination with self-supervised dimensionality reduction is described. The method is compared to other methods on synthetic and real data obtained from crashworthiness analyses. The results show a superiority of the here described approach over state of the art surrogate modelling techniques, Polynomial Chaos Expansions and Neural Networks.
ITMay 16
On Trajectory-Based Stability Analysis for $1$-bit Sigma-Delta Quantization and its Application to the Second-Order CaseRohan Joy, Felix Krahmer, Alessandro Lupoli
A state-of-the-art strategy for digitally representing a bandlimited signal $f$ is $ΣΔ$ quantization. $ΣΔ$ quantization schemes choose a bit sequence $(q_n)$ representing the samples $(y_n)$ of $f$ sequentially based on a state sequence $(u_n)$ defined via a recurrence relation of the form \begin{equation*} u_n = (h*u)_n + y_n - q_n, \end{equation*} where $h_j = 0$ for $j\le 0.$ The effectiveness of a quantization scheme crucially depends on the fact that it is stable, i.e. , the state variable remains uniformly bounded in a given class of signals. Thus, a common strategy is to choose $$q_n = \operatorname{sign}((h*u)_n + y_n).$$ It is well known that a sufficient condition for this quantization rule to induce stability is that $$ \|h\|_{\ell^1}+\|f\|_{\infty}\le 2.$$ At the same time, one empirically observes that this condition is conservative and stability holds significantly beyond this bound. In this paper, we address this gap by establishing the first stability guarantees beyond first order that outperform the $\ell^1$ based stability condition. In contrast to many previous approaches, our analysis describes the trajectories of the state variables rather than characterizing the invariant set, an approach that had previously been performed only in some specific example cases. This viewpoint has the main advantage that it makes it possible to treat longer filters, which are difficult to handle through invariant-set analysis because of the resulting high dimensionality. We apply our technique to second-order $ΣΔ$ schemes with sparse feedback filters as proposed by Günturk \cite{gunturk2003one}, showing that the filter length required to guarantee stability significantly improves from the length $O\left(\frac{1}{1-\|f\|_{\infty}}\right)$ needed to apply the $\ell^1$ based criterion to $O\left(\frac{1}{\sqrt{1-\|f\|_{\infty}}}\right)$.
LGJul 18, 2024
Non-Asymptotic Uncertainty Quantification in High-Dimensional LearningFrederik Hoppe, Claudio Mayrink Verdun, Hannah Laus et al.
Uncertainty quantification (UQ) is a crucial but challenging task in many high-dimensional regression or learning problems to increase the confidence of a given predictor. We develop a new data-driven approach for UQ in regression that applies both to classical regression approaches such as the LASSO as well as to neural networks. One of the most notable UQ techniques is the debiased LASSO, which modifies the LASSO to allow for the construction of asymptotic confidence intervals by decomposing the estimation error into a Gaussian and an asymptotically vanishing bias component. However, in real-world problems with finite-dimensional data, the bias term is often too significant to be neglected, resulting in overly narrow confidence intervals. Our work rigorously addresses this issue and derives a data-driven adjustment that corrects the confidence intervals for a large class of predictors by estimating the means and variances of the bias terms from training data, exploiting high-dimensional concentration phenomena. This gives rise to non-asymptotic confidence intervals, which can help avoid overestimating uncertainty in critical applications such as MRI diagnosis. Importantly, our analysis extends beyond sparse regression to data-driven predictors like neural networks, enhancing the reliability of model-based deep learning. Our findings bridge the gap between established theory and the practical applicability of such debiased methods.
SPJul 18, 2024
With or Without Replacement? Improving Confidence in Fourier ImagingFrederik Hoppe, Claudio Mayrink Verdun, Felix Krahmer et al.
Over the last few years, debiased estimators have been proposed in order to establish rigorous confidence intervals for high-dimensional problems in machine learning and data science. The core argument is that the error of these estimators with respect to the ground truth can be expressed as a Gaussian variable plus a remainder term that vanishes as long as the dimension of the problem is sufficiently high. Thus, uncertainty quantification (UQ) can be performed exploiting the Gaussian model. Empirically, however, the remainder term cannot be neglected in many realistic situations of moderately-sized dimensions, in particular in certain structured measurement scenarios such as Magnetic Resonance Imaging (MRI). This, in turn, can downgrade the advantage of the UQ methods as compared to non-UQ approaches such as the standard LASSO. In this paper, we present a method to improve the debiased estimator by sampling without replacement. Our approach leverages recent results of ours on the structure of the random nature of certain sampling schemes showing how a transition between sampling with and without replacement can lead to a weighted reconstruction scheme with improved performance for the standard LASSO. In this paper, we illustrate how this reweighted sampling idea can also improve the debiased estimator and, consequently, provide a better method for UQ in Fourier imaging.
ITMay 10, 2021Code
The Modulo Radon Transform: Theory, Algorithms and ApplicationsMatthias Beckmann, Ayush Bhandari, Felix Krahmer
Recently, experiments have been reported where researchers were able to perform high dynamic range (HDR) tomography in a heuristic fashion, by fusing multiple tomographic projections. This approach to HDR tomography has been inspired by HDR photography and inherits the same disadvantages. Taking a computational imaging approach to the HDR tomography problem, we here suggest a new model based on the Modulo Radon Transform (MRT), which we rigorously introduce and analyze. By harnessing a joint design between hardware and algorithms, we present a single-shot HDR tomography approach, which to our knowledge, is the only approach that is backed by mathematical guarantees. On the hardware front, instead of recording the Radon Transform projections that my potentially saturate, we propose to measure modulo values of the same. This ensures that the HDR measurements are folded into a lower dynamic range. On the algorithmic front, our recovery algorithms reconstruct the HDR images from folded measurements. Beyond mathematical aspects such as injectivity and inversion of the MRT for different scenarios including band-limited and approximately compactly supported images, we also provide a first proof-of-concept demonstration. To do so, we implement MRT by experimentally folding tomographic measurements available as an open source data set using our custom designed modulo hardware. Our reconstruction clearly shows the advantages of our approach for experimental data. In this way, our MRT based solution paves a path for HDR acquisition in a number of related imaging problems.
LGFeb 21, 2025
Solving Inverse Problems with Deep Linear Neural Networks: Global Convergence Guarantees for Gradient Descent with Weight DecayHannah Laus, Suzanna Parkinson, Vasileios Charisopoulos et al.
Machine learning methods are commonly used to solve inverse problems, wherein an unknown signal must be estimated from few measurements generated via a known acquisition procedure. In particular, neural networks perform well empirically but have limited theoretical guarantees. In this work, we study an underdetermined linear inverse problem that admits several possible solution mappings. A standard remedy (e.g., in compressed sensing) establishing uniqueness of the solution mapping is to assume knowledge of latent low-dimensional structure in the source signal. We ask the following question: do deep neural networks adapt to this low-dimensional structure when trained by gradient descent with weight decay regularization? We prove that mildly overparameterized deep linear networks trained in this manner converge to an approximate solution that accurately solves the inverse problem while implicitly encoding latent subspace structure. To our knowledge, this is the first result to rigorously show that deep linear networks trained with weight decay automatically adapt to latent subspace structure in the data under practical stepsize and weight initialization schemes. Our work highlights that regularization and overparameterization improve generalization, while overparameterization also accelerates convergence during training.
LGOct 21, 2024
Implicit Regularization for Tubal Tensor Factorizations via Gradient DescentSanthosh Karnik, Anna Veselovska, Mark Iwen et al.
We provide a rigorous analysis of implicit regularization in an overparametrized tensor factorization problem beyond the lazy training regime. For matrix factorization problems, this phenomenon has been studied in a number of works. A particular challenge has been to design universal initialization strategies which provably lead to implicit regularization in gradient-descent methods. At the same time, it has been argued by Cohen et. al. 2016 that more general classes of neural networks can be captured by considering tensor factorizations. However, in the tensor case, implicit regularization has only been rigorously established for gradient flow or in the lazy training regime. In this paper, we prove the first tensor result of its kind for gradient descent rather than gradient flow. We focus on the tubal tensor product and the associated notion of low tubal rank, encouraged by the relevance of this model for image data. We establish that gradient descent in an overparametrized tensor factorization model with a small random initialization exhibits an implicit bias towards solutions of low tubal rank. Our theoretical findings are illustrated in an extensive set of numerical simulations show-casing the dynamics predicted by our theory as well as the crucial role of using a small random initialization.
ITFeb 28, 2019
On the convex geometry of blind deconvolution and matrix completionFelix Krahmer, Dominik Stöger
Low-rank matrix recovery from structured measurements has been a topic of intense study in the last decade and many important problems like matrix completion and blind deconvolution have been formulated in this framework. An important benchmark method to solve these problems is to minimize the nuclear norm, a convex proxy for the rank. A common approach to establish recovery guarantees for this convex program relies on the construction of a so-called approximate dual certificate. However, this approach provides only limited insight in various respects. Most prominently, the noise bounds exhibit seemingly suboptimal dimension factors. In this paper we take a novel, more geometric viewpoint to analyze both the matrix completion and the blind deconvolution scenario. We find that for both these applications the dimension factors in the noise bounds are not an artifact of the proof, but the problems are intrinsically badly conditioned. We show, however, that bad conditioning only arises for very small noise levels: Under mild assumptions that include many realistic noise levels we derive near-optimal error estimates for blind deconvolution under adversarial noise.
CVJun 21, 2018
Are good local minima wide in sparse recovery?Michael Moeller, Otmar Loffeld, Juergen Gall et al.
The idea of compressed sensing is to exploit representations in suitable (overcomplete) dictionaries that allow to recover signals far beyond the Nyquist rate provided that they admit a sparse representation in the respective dictionary. The latter gives rise to the sparse recovery problem of finding the best sparse linear approximation of given data in a given generating system. In this paper we analyze the iterative hard thresholding (IHT) algorithm as one of the most popular greedy methods for solving the sparse recovery problem, and demonstrate that systematically perturbing the IHT algorithm by adding noise to intermediate iterates yields improved results. Further improvements can be obtained by entirely rephrasing the problem as a parametric deep-learning-type of optimization problem. By introducing perturbations via dropout, we demonstrate to significantly outperform the classical IHT algorithm, obtaining $3$ to $6$ times lower average objective errors.
CVOct 8, 2012
Stable and robust sampling strategies for compressive imagingFelix Krahmer, Rachel Ward
In many signal processing applications, one wishes to acquire images that are sparse in transform domains such as spatial finite differences or wavelets using frequency domain samples. For such applications, overwhelming empirical evidence suggests that superior image reconstruction can be obtained through variable density sampling strategies that concentrate on lower frequencies. The wavelet and Fourier transform domains are not incoherent because low-order wavelets and low-order frequencies are correlated, so compressive sensing theory does not immediately imply sampling strategies and reconstruction guarantees. In this paper we turn to a more refined notion of coherence -- the so-called local coherence -- measuring for each sensing vector separately how correlated it is to the sparsity basis. For Fourier measurements and Haar wavelet sparsity, the local coherence can be controlled and bounded explicitly, so for matrices comprised of frequencies sampled from a suitable inverse square power-law density, we can prove the restricted isometry property with near-optimal embedding dimensions. Consequently, the variable-density sampling strategy we provide allows for image reconstructions that are stable to sparsity defects and robust to measurement noise. Our results cover both reconstruction by $\ell_1$-minimization and by total variation minimization. The local coherence framework developed in this paper should be of independent interest in sparse recovery problems more generally, as it implies that for optimal sparse recovery results, it suffices to have bounded \emph{average} coherence from sensing basis to sparsity basis -- as opposed to bounded maximal coherence -- as long as the sampling strategy is adapted accordingly.