Anders C. Hansen

LG
h-index7
17papers
971citations
Novelty62%
AI Score47

17 Papers

ITJun 23, 2014
Breaking the coherence barrier: A new theory for compressed sensing

Ben Adcock, Anders C. Hansen, Clarice Poon et al.

This paper provides an extension of compressed sensing which bridges a substantial gap between existing theory and its current use in real-world applications. It introduces a mathematical framework that generalizes the three standard pillars of compressed sensing - namely, sparsity, incoherence and uniform random subsampling - to three new concepts: asymptotic sparsity, asymptotic incoherence and multilevel random sampling. The new theorems show that compressed sensing is also possible, and reveals several advantages, under these substantially relaxed conditions. The importance of this is threefold. First, inverse problems to which compressed sensing is currently applied are typically coherent. The new theory provides the first comprehensive mathematical explanation for a range of empirical usages of compressed sensing in real-world applications, such as medical imaging, microscopy, spectroscopy and others. Second, in showing that compressed sensing does not require incoherence, but instead that asymptotic incoherence is sufficient, the new theory offers markedly greater flexibility in the design of sensing mechanisms. Third, by using asymptotic incoherence and multi-level sampling to exploit not just sparsity, but also structure, i.e. asymptotic sparsity, the new theory shows that substantially improved reconstructions can be obtained from fewer measurements.

NANov 30, 2010
A Generalized Sampling Theorem for Stable Reconstructions in Arbitrary Bases

Ben Adcock, Anders C. Hansen

We introduce a generalized framework for sampling and reconstruction in separable Hilbert spaces. Specifically, we establish that it is always possible to stably reconstruct a vector in an arbitrary Riesz basis from sufficiently many of its samples in any other Riesz basis. This framework can be viewed as an extension of that of Eldar et al. However, whilst the latter imposes stringent assumptions on the reconstruction basis, and may in practice be unstable, our framework allows for recovery in any (Riesz) basis in a manner that is completely stable. Whilst the classical Shannon Sampling Theorem is a special case of our theorem, this framework allows us to exploit additional information about the approximated vector (or, in this case, function), for example sparsity or regularity, to design a reconstruction basis that is better suited. Examples are presented illustrating this procedure.

NAJan 13, 2013
Beyond consistent reconstructions: optimality and sharp bounds for generalized sampling, and application to the uniform resampling problem

Ben Adcock, Anders C. Hansen, Clarice Poon

Generalized sampling is a recently developed linear framework for sampling and reconstruction in separable Hilbert spaces. It allows one to recover any element in any finite-dimensional subspace given finitely many of its samples with respect to an arbitrary frame. Unlike more common approaches for this problem, such as the consistent reconstruction technique of Eldar et al, it leads to completely stable numerical methods possessing both guaranteed stability and accuracy. The purpose of this paper is twofold. First, we give a complete and formal analysis of generalized sampling, the main result of which being the derivation of new, sharp bounds for the accuracy and stability of this approach. Such bounds improve those given previously, and result in a necessary and sufficient condition, the stable sampling rate, which guarantees a priori a good reconstruction. Second, we address the topic of optimality. Under some assumptions, we show that generalized sampling is an optimal, stable reconstruction. Correspondingly, whenever these assumptions hold, the stable sampling rate is a universal quantity. In the final part of the paper we illustrate our results by applying generalized sampling to the so-called uniform resampling problem.

NANov 30, 2010
Stable reconstructions in Hilbert spaces and the resolution of the Gibbs phenomenon

Ben Adcock, Anders C. Hansen

We introduce a method to reconstruct an element of a Hilbert space in terms of an arbitrary finite collection of linearly independent reconstruction vectors, given a finite number of its samples with respect to any Riesz basis. As we establish, provided the dimension of the reconstruction space is chosen suitably in relation to the number of samples, this procedure can be numerically implemented in a stable manner. Moreover, the accuracy of the resulting approximation is completely determined by the choice of reconstruction basis, meaning that the reconstruction vectors can be tailored to the particular problem at hand. An important example of this approach is the accurate recovery of a piecewise analytic function from its first few Fourier coefficients. Whilst the standard Fourier projection suffers from the Gibbs phenomenon, by reconstructing in a piecewise polynomial basis, we obtain an approximation with root exponential accuracy in terms of the number of Fourier samples and exponential accuracy in terms of the degree of the reconstruction function. Numerical examples illustrate the advantage of this approach over other existing methods.

NAMay 12, 2013
On optimal wavelet reconstructions from Fourier samples: linearity and universality of the stable sampling rate

Ben Adcock, Anders C. Hansen, Clarice Poon

In this paper we study the problem of computing wavelet coefficients of compactly supported functions from their Fourier samples. For this, we use the recently introduced framework of generalized sampling. Our first result demonstrates that using generalized sampling one obtains a stable and accurate reconstruction, provided the number of Fourier samples grows linearly in the number of wavelet coefficients recovered. For the class of Daubechies wavelets we derive the exact constant of proportionality. Our second result concerns the optimality of generalized sampling for this problem. Under some mild assumptions we show that generalized sampling cannot be outperformed in terms of approximation quality by more than a constant factor. Moreover, for the class of so-called perfect methods, any attempt to lower the sampling ratio below a certain critical threshold necessarily results in exponential ill-conditioning. Thus generalized sampling provides a nearly-optimal solution to this problem.

NAJan 31, 2013
A stability barrier for reconstructions from Fourier samples

Ben Adcock, Anders C. Hansen, Alexei Shadrin

We prove that any stable method for resolving the Gibbs phenomenon - that is, recovering high-order accuracy from the first $m$ Fourier coefficients of an analytic and nonperiodic function - can converge at best root-exponentially fast in $m$. Any method with faster convergence must also be unstable, and in particular, exponential convergence implies exponential ill-conditioning. This result is analogous to a recent theorem of Platte, Trefethen & Kuijlaars concerning recovery from pointwise function values on an equispaced $m$-grid. The main step in our proof is an estimate for the maximal behaviour of a polynomial of degree $n$ with bounded $m$-term Fourier series, which is related to a conjecture of Hrycak & Groechenig. In the second part of the paper we discuss the implications of our main theorem to polynomial-based interpolation and least-squares approaches for overcoming the Gibbs phenomenon. Finally, we consider the use of so-called Fourier extensions as an attractive alternative for this problem. We present numerical results demonstrating rapid convergence in a stable manner.

AIAug 5, 2024
On the consistent reasoning paradox of intelligence and optimal trust in AI: The power of 'I don't know'

Alexander Bastounis, Paolo Campodonico, Mihaela van der Schaar et al.

We introduce the Consistent Reasoning Paradox (CRP). Consistent reasoning, which lies at the core of human intelligence, is the ability to handle tasks that are equivalent, yet described by different sentences ('Tell me the time!' and 'What is the time?'). The CRP asserts that consistent reasoning implies fallibility -- in particular, human-like intelligence in AI necessarily comes with human-like fallibility. Specifically, it states that there are problems, e.g. in basic arithmetic, where any AI that always answers and strives to mimic human intelligence by reasoning consistently will hallucinate (produce wrong, yet plausible answers) infinitely often. The paradox is that there exists a non-consistently reasoning AI (which therefore cannot be on the level of human intelligence) that will be correct on the same set of problems. The CRP also shows that detecting these hallucinations, even in a probabilistic sense, is strictly harder than solving the original problems, and that there are problems that an AI may answer correctly, but it cannot provide a correct logical explanation for how it arrived at the answer. Therefore, the CRP implies that any trustworthy AI (i.e., an AI that never answers incorrectly) that also reasons consistently must be able to say 'I don't know'. Moreover, this can only be done by implicitly computing a new concept that we introduce, termed the 'I don't know' function -- something currently lacking in modern AI. In view of these insights, the CRP also provides a glimpse into the behaviour of Artificial General Intelligence (AGI). An AGI cannot be 'almost sure', nor can it always explain itself, and therefore to be trustworthy it must be able to say 'I don't know'.

LGSep 13, 2023
The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning

Alexander Bastounis, Alexander N. Gorban, Anders C. Hansen et al.

In this work, we assess the theoretical limitations of determining guaranteed stability and accuracy of neural networks in classification tasks. We consider classical distribution-agnostic framework and algorithms minimising empirical risks and potentially subjected to some weights regularisation. We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks in the above settings is extremely challenging, if at all possible, even when such ideal solutions exist within the given class of neural architectures.

NAOct 22, 2016
Complexity Issues in Computing Spectra, Pseudospectra and Resolvents

Anders C. Hansen, Olavi Nevanlinna

We display methods that allow for computations of spectra, pseudospectra and resolvents of linear operators on Hilbert spaces and also elements in unital Banach algebras. The paper considers two different approaches, namely, pseudospectral techniques and polynomial numerical hull theory. The former is used for Hilbert space operators whereas the latter can handle the general case of elements in a Banach algebra. This approach leads to multicentric holomorphic calculus. We also discuss some new types of pseudospectra and the recently defined Solvability Complexity Index

LGJul 13, 2023
Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks

Johan S. Wind, Vegard Antun, Anders C. Hansen

Understanding the implicit regularization imposed by neural network architectures and gradient based optimization methods is a key challenge in deep learning and AI. In this work we provide sharp results for the implicit regularization imposed by the gradient flow of Diagonal Linear Networks (DLNs) in the over-parameterized regression setting and, potentially surprisingly, link this to the phenomenon of phase transitions in generalized hardness of approximation (GHA). GHA generalizes the phenomenon of hardness of approximation from computer science to, among others, continuous and robust optimization. It is well-known that the $\ell^1$-norm of the gradient flow of DLNs with tiny initialization converges to the objective function of basis pursuit. We improve upon these results by showing that the gradient flow of DLNs with tiny initialization approximates minimizers of the basis pursuit optimization problem (as opposed to just the objective function), and we obtain new and sharp convergence bounds w.r.t.\ the initialization size. Non-sharpness of our results would imply that the GHA phenomenon would not occur for the basis pursuit optimization problem -- which is a contradiction -- thus implying sharpness. Moreover, we characterize $\textit{which}$ $\ell_1$ minimizer of the basis pursuit problem is chosen by the gradient flow whenever the minimizer is not unique. Interestingly, this depends on the depth of the DLN.

MLMay 13
On Hallucinations in Inverse Problems: Fundamental Limits and Provable Assessment Methods

David Iagaru, Nina M. Gottschling, Anders C. Hansen et al.

Artificial intelligence (AI) has transformed imaging inverse problems, from medical diagnostics to Earth observation. Yet deep neural networks can produce hallucinations, realistic-looking but incorrect details, undermining their reliability, especially when ground truth data is unavailable. We develop a theoretical framework showing that such hallucinations are not merely artifacts of particular models, but can arise from the ill-posed nature of the inverse problem itself. We derive necessary and sufficient conditions for hallucinations, together with computable bounds on their magnitude that depend only on the forward model. Building on this theory, we introduce algorithms to: (1) estimate the minimum hallucination magnitude achievable by any reconstruction model for a given input; (2) assess the faithfulness of reconstructed details by a given reconstruction model. Experiments across three imaging tasks demonstrate that our approach applies broadly, including to modern generative models, and provides a principled way to quantify and evaluate AI hallucinations.

OCDec 18, 2023
When can you trust feature selection? -- I: A condition-based analysis of LASSO and generalised hardness of approximation

Alexander Bastounis, Felipe Cucker, Anders C. Hansen

The arrival of AI techniques in computations, with the potential for hallucinations and non-robustness, has made trustworthiness of algorithms a focal point. However, trustworthiness of the many classical approaches are not well understood. This is the case for feature selection, a classical problem in the sciences, statistics, machine learning etc. Here, the LASSO optimisation problem is standard. Despite its widespread use, it has not been established when the output of algorithms attempting to compute support sets of minimisers of LASSO in order to do feature selection can be trusted. In this paper we establish how no (randomised) algorithm that works on all inputs can determine the correct support sets (with probability $> 1/2$) of minimisers of LASSO when reading approximate input, regardless of precision and computing power. However, we define a LASSO condition number and design an efficient algorithm for computing these support sets provided the input data is well-posed (has finite condition number) in time polynomial in the dimensions and logarithm of the condition number. For ill-posed inputs the algorithm runs forever, hence, it will never produce a wrong answer. Furthermore, the algorithm computes an upper bound for the condition number when this is finite. Finally, for any algorithm defined on an open set containing a point with infinite condition number, there is an input for which the algorithm will either run forever or produce a wrong answer. Our impossibility results stem from generalised hardness of approximation -- within the Solvability Complexity Index (SCI) hierarchy framework -- that generalises the classical phenomenon of hardness of approximation.

LGJan 20, 2021
Can stable and accurate neural networks be computed? -- On the barriers of deep learning and Smale's 18th problem

Matthew J. Colbrook, Vegard Antun, Anders C. Hansen

Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force. However, current DL methods typically suffer from instability, even when universal approximation properties guarantee the existence of stable neural networks (NNs). We address this paradox by demonstrating basic well-conditioned problems in scientific computing where one can prove the existence of NNs with great approximation qualities, however, there does not exist any algorithm, even randomised, that can train (or compute) such a NN. For any positive integers $K > 2$ and $L$, there are cases where simultaneously: (a) no randomised training algorithm can compute a NN correct to $K$ digits with probability greater than $1/2$, (b) there exists a deterministic training algorithm that computes a NN with $K-1$ correct digits, but any such (even randomised) algorithm needs arbitrarily many training data, (c) there exists a deterministic training algorithm that computes a NN with $K-2$ correct digits using no more than $L$ training samples. These results imply a classification theory describing conditions under which (stable) NNs with a given accuracy can be computed by an algorithm. We begin this theory by establishing sufficient conditions for the existence of algorithms that compute stable NNs in inverse problems. We introduce Fast Iterative REstarted NETworks (FIRENETs), which we both prove and numerically verify are stable. Moreover, we prove that only $\mathcal{O}(|\log(ε)|)$ layers are needed for an $ε$-accurate solution to the inverse problem.

LGJan 5, 2020
The troublesome kernel -- On hallucinations, no free lunches and the accuracy-stability trade-off in inverse problems

Nina M. Gottschling, Vegard Antun, Anders C. Hansen et al.

Methods inspired by Artificial Intelligence (AI) are starting to fundamentally change computational science and engineering through breakthrough performances on challenging problems. However, reliability and trustworthiness of such techniques is a major concern. In inverse problems in imaging, the focus of this paper, there is increasing empirical evidence that methods may suffer from hallucinations, i.e., false, but realistic-looking artifacts; instability, i.e., sensitivity to perturbations in the data; and unpredictable generalization, i.e., excellent performance on some images, but significant deterioration on others. This paper provides a theoretical foundation for these phenomena. We give mathematical explanations for how and when such effects arise in arbitrary reconstruction methods, with several of our results taking the form of `no free lunch' theorems. Specifically, we show that (i) methods that overperform on a single image can wrongly transfer details from one image to another, creating a hallucination, (ii) methods that overperform on two or more images can hallucinate or be unstable, (iii) optimizing the accuracy-stability trade-off is generally difficult, (iv) hallucinations and instabilities, if they occur, are not rare events, and may be encouraged by standard training, (v) it may be impossible to construct optimal reconstruction maps for certain problems. Our results trace these effects to the kernel of the forward operator whenever it is nontrivial, but also apply to the case when the forward operator is ill-conditioned. Based on these insights, our work aims to spur research into new ways to develop robust and reliable AI-based methods for inverse problems in imaging.

MLJun 4, 2019
What do AI algorithms actually learn? - On false structures in deep learning

Laura Thesing, Vegard Antun, Anders C. Hansen

There are two big unsolved mathematical questions in artificial intelligence (AI): (1) Why is deep learning so successful in classification problems and (2) why are neural nets based on deep learning at the same time universally unstable, where the instabilities make the networks vulnerable to adversarial attacks. We present a solution to these questions that can be summed up in two words; false structures. Indeed, deep learning does not learn the original structures that humans use when recognising images (cats have whiskers, paws, fur, pointy ears, etc), but rather different false structures that correlate with the original structure and hence yield the success. However, the false structure, unlike the original structure, is unstable. The false structure is simpler than the original structure, hence easier to learn with less data and the numerical algorithm used in the training will more easily converge to the neural network that captures the false structure. We formally define the concept of false structures and formulate the solution as a conjecture. Given that trained neural networks always are computed with approximations, this conjecture can only be established through a combination of theoretical and computational results similar to how one establishes a postulate in theoretical physics (e.g. the speed of light is constant). Establishing the conjecture fully will require a vast research program characterising the false structures. We provide the foundations for such a program establishing the existence of the false structures in practice. Finally, we discuss the far reaching consequences the existence of the false structures has on state-of-the-art AI and Smale's 18th problem.

CVFeb 14, 2019
On instabilities of deep learning in image reconstruction - Does AI come at a cost?

Vegard Antun, Francesco Renna, Clarice Poon et al.

Deep learning, due to its unprecedented success in tasks such as image classification, has emerged as a new tool in image reconstruction with potential to change the field. In this paper we demonstrate a crucial phenomenon: deep learning typically yields unstablemethods for image reconstruction. The instabilities usually occur in several forms: (1) tiny, almost undetectable perturbations, both in the image and sampling domain, may result in severe artefacts in the reconstruction, (2) a small structural change, for example a tumour, may not be captured in the reconstructed image and (3) (a counterintuitive type of instability) more samples may yield poorer performance. Our new stability test with algorithms and easy to use software detects the instability phenomena. The test is aimed at researchers to test their networks for instabilities and for government agencies, such as the Food and Drug Administration (FDA), to secure safe use of deep learning methods.

NASep 9, 2016
Density theorems for nonuniform sampling of bandlimited functions using derivatives or bunched measurements

Ben Adcock, Milana Gataric, Anders C. Hansen

We provide sufficient density condition for a set of nonuniform samples to give rise to a set of sampling for multivariate bandlimited functions when the measurements consist of pointwise evaluations of a function and its first $k$ derivatives. Along with explicit estimates of corresponding frame bounds, we derive the explicit density bound and show that, as $k$ increases, it grows linearly in $k+1$ with the constant of proportionality $1/\mathrm{e}$. Seeking larger gap conditions, we also prove a multivariate perturbation result for nonuniform samples that are sufficiently close to sets of sampling, e.g. to uniform samples taken at $k+1$ times the Nyquist rate. Additionally, in the univariate setting, we consider a related problem of so-called nonuniform bunched sampling, where in each sampling interval $s+1$ bunched measurements of a function are taken and the sampling intervals are permitted to be of different length. We derive an explicit density condition which grows linearly in $s+1$ for large $s$, with the constant of proportionality depending on the width of the bunches. The width of the bunches is allowed to be arbitrarily small, and moreover, for sufficiently narrow bunches and sufficiently large $s$, we obtain the same result as in the case of univariate sampling with $s$ derivatives.