CO Feb 27, 2015
Fast Bayesian Optimal Experimental Design for Seismic Source Inversion
Quan Long, Mohammad Motamed, Raul Tempone
We develop a fast method for optimally designing experiments in the context of statistical seismic source inversion. In particular, we efficiently compute the optimal number and locations of the receivers or seismographs. The seismic source is modeled by a point moment tensor multiplied by a time-dependent function. The parameters include the source location, moment tensor components, and start time and frequency in the time function. The forward problem is modeled by elastodynamic wave equations. We show that the Hessian of the cost functional, which is usually defined as the square of the weighted L2 norm of the difference between the experimental data and the simulated data, is proportional to the measurement time and the number of receivers. Consequently, the posterior distribution of the parameters, in a Bayesian setting, concentrates around the "true" parameters, and we can employ Laplace approximation and speed up the estimation of the expected Kullback-Leibler divergence (expected information gain), the optimality criterion in the experimental design procedure. Since the source parameters span several orders of magnitude, we use a scaling matrix for efficient control of the condition number of the original Hessian matrix. We use a second-order accurate finite difference method to compute the Hessian matrix and either sparse quadrature or Monte Carlo sampling to carry out numerical integration. We demonstrate the efficiency, accuracy, and applicability of our method on a two-dimensional seismic source inversion problem.
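As a rough illustration of why a Hessian that grows with the number of receivers makes the information gain easy to evaluate, the sketch below uses a toy linear-Gaussian model, for which the Laplace approximation is exact. It is not the elastodynamic forward model of the paper; the function name, the sensitivity block `receiver_block`, and the noise and prior values are all assumptions chosen for illustration.

```python
import numpy as np

def expected_information_gain(G, noise_std, prior_cov):
    """Expected KL divergence from prior to posterior for the toy model
    y = G @ theta + noise, with noise ~ N(0, noise_std^2 I) and a Gaussian prior.
    In this linear-Gaussian case the posterior is exactly Gaussian."""
    hessian = G.T @ G / noise_std**2            # Hessian of the data misfit
    d = prior_cov.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(d) + prior_cov @ hessian)
    return 0.5 * logdet

# Hypothetical sensitivity of one receiver to two source parameters.
receiver_block = np.array([[1.0, 0.5],
                           [0.0, 1.0]])
prior_cov = np.eye(2)

# Stacking n identical receivers scales the Hessian by n.
gains = []
for n_receivers in (1, 2, 4, 8):
    G = np.tile(receiver_block, (n_receivers, 1))
    gains.append(expected_information_gain(G, noise_std=0.5, prior_cov=prior_cov))
```

In this toy setting the gain increases with the number of receivers, but only logarithmically, which is one reason an optimal number of receivers can be meaningful once measurement cost is taken into account.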
NA Dec 26, 2018
Wasserstein metric-driven Bayesian inversion with applications to signal processing
Mohammad Motamed, Daniel Appelo
We present a Bayesian framework based on a new exponential likelihood function driven by the quadratic Wasserstein metric. Compared to conventional Bayesian models based on Gaussian likelihood functions driven by the least-squares norm ($L_2$ norm), the new framework features several advantages. First, the new framework does not rely on the likelihood of the measurement noise and hence can treat complicated noise structures such as combined additive and multiplicative noise. Second, unlike the normal likelihood function, the Wasserstein-based exponential likelihood function does not usually generate multiple local extrema. As a result, the new framework features better convergence to correct posteriors when a Markov Chain Monte Carlo sampling algorithm is employed. Third, in the particular case of signal processing problems, while a normal likelihood function measures only the amplitude differences between the observed and simulated signals, the new likelihood function can capture both the amplitude and the phase differences. We apply the new framework to a class of signal processing problems, that is, the inverse uncertainty quantification of waveforms, and demonstrate its advantages compared to Bayesian models with normal likelihood functions.
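To make the phase sensitivity concrete, here is a minimal sketch of the squared quadratic Wasserstein distance between two one-dimensional signals, computed from their inverse cumulative distributions. It assumes strictly positive signals on a uniform grid that are normalized to unit mass; how the paper treats general (signed) signals is not stated in the abstract. The function names and the `scale` parameter are hypothetical.

```python
import numpy as np

def w2_squared_1d(t, f, g, n_quantiles=2000):
    """Squared quadratic Wasserstein distance between two positive signals
    sampled on the same uniform grid t, after normalizing each to unit mass."""
    cdf_f = np.cumsum(f) / np.sum(f)
    cdf_g = np.cumsum(g) / np.sum(g)
    # Midpoint quantile levels in (0, 1), avoiding the flat tails of the CDFs.
    s = (np.arange(n_quantiles) + 0.5) / n_quantiles
    q_f = np.interp(s, cdf_f, t)   # inverse CDF of f
    q_g = np.interp(s, cdf_g, t)   # inverse CDF of g
    return np.mean((q_f - q_g) ** 2)

def wasserstein_likelihood(t, observed, simulated, scale=1.0):
    """Exponential likelihood driven by W2^2; `scale` is an assumed tuning parameter."""
    return np.exp(-scale * w2_squared_1d(t, observed, simulated))

def bump(t, center, width=0.5):
    return np.exp(-((t - center) ** 2) / (2.0 * width**2))

t = np.linspace(-5.0, 6.0, 2201)
w2_shift1 = w2_squared_1d(t, bump(t, 0.0), bump(t, 1.0))
w2_shift2 = w2_squared_1d(t, bump(t, 0.0), bump(t, 2.0))
likelihood_shift1 = wasserstein_likelihood(t, bump(t, 0.0), bump(t, 1.0))
```

For two identical bumps that differ only by a shift, the squared distance equals the squared shift, so the misfit keeps growing with the phase difference rather than saturating.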
LG Oct 5, 2023
Residual Multi-Fidelity Neural Network Computing
Owen Davis, Mohammad Motamed, Raul Tempone
In this work, we consider the general problem of constructing a neural network surrogate model using multi-fidelity information. Motivated by error-complexity estimates for ReLU neural networks, we formulate the correlation between an inexpensive low-fidelity model and an expensive high-fidelity model as a possibly non-linear residual function. This function defines a mapping between 1) the shared input space of the models along with the low-fidelity model output, and 2) the discrepancy between the outputs of the two models. The computational framework proceeds by training two neural networks to work in concert. The first network learns the residual function on a small set of high- and low-fidelity data. Once trained, this network is used to generate additional synthetic high-fidelity data, which is used in the training of the second network. The trained second network then acts as our surrogate for the high-fidelity quantity of interest. We present four numerical examples to demonstrate the power of the proposed framework, showing that significant savings in computational cost may be achieved when the output predictions are desired to be accurate within small tolerances.
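The two-stage workflow described above can be sketched on a toy problem. To keep the example short and dependency-free, least-squares fits stand in for the two neural networks, so this illustrates the data flow only, not the paper's architecture or training. The models `low_fidelity` and `high_fidelity`, the feature map, and all sizes are assumptions.

```python
import numpy as np
from numpy.polynomial import Polynomial

def low_fidelity(x):
    """Cheap model (hypothetical)."""
    return np.sin(2.0 * np.pi * x)

def high_fidelity(x):
    """Expensive model (hypothetical)."""
    return np.sin(2.0 * np.pi * x) + 0.2 * x

def features(x, y_lf):
    """Inputs of the residual map: the shared input x and the low-fidelity output."""
    return np.column_stack([np.ones_like(x), x, y_lf])

# Stage 1: learn the residual (HF minus LF) from a small paired data set.
x_small = np.linspace(0.0, 1.0, 6)
lf_small = low_fidelity(x_small)
residual = high_fidelity(x_small) - lf_small
coef, *_ = np.linalg.lstsq(features(x_small, lf_small), residual, rcond=None)

# Stage 2: generate synthetic high-fidelity data from many cheap LF evaluations.
x_large = np.linspace(0.0, 1.0, 200)
lf_large = low_fidelity(x_large)
synthetic_hf = lf_large + features(x_large, lf_large) @ coef

# Stage 3: fit the surrogate for the high-fidelity quantity on the synthetic data.
surrogate = Polynomial.fit(x_large, synthetic_hf, deg=9)
max_error = np.max(np.abs(surrogate(x_large) - high_fidelity(x_large)))
```

Only six high-fidelity evaluations are used here; the surrogate itself is trained on 200 synthetic points, which is where the cost saving would come from when the high-fidelity model is expensive.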
LG Jul 19, 2022
Approximation Power of Deep Neural Networks: an explanatory mathematical survey
Owen Davis, Mohammad Motamed
This survey provides an in-depth and explanatory review of the approximation properties of deep neural networks, with a focus on feed-forward and residual architectures. The primary objective is to examine how effectively neural networks approximate target functions and to identify conditions under which they outperform traditional approximation methods. Key topics include the nonlinear, compositional structure of deep networks and the formalization of neural network tasks as optimization problems in regression and classification settings. The survey also addresses the training process, emphasizing the role of stochastic gradient descent and backpropagation in solving these optimization problems, and highlights practical considerations such as activation functions, overfitting, and regularization techniques. Additionally, the survey explores the density of neural networks in the space of continuous functions, comparing the approximation capabilities of deep ReLU networks with those of other approximation methods. It discusses recent theoretical advancements in understanding the expressiveness and limitations of these networks. A detailed error-complexity analysis is also presented, focusing on error rates and computational complexity for neural networks with ReLU and Fourier-type activation functions in the context of bounded target functions with minimal regularity assumptions. Alongside recent known results, the survey introduces new findings, offering a valuable resource for understanding the theoretical foundations of neural network approximation. Concluding remarks and further reading suggestions are provided.
LG Jul 16, 2024
Deep Learning without Global Optimization by Random Fourier Neural Networks
Owen Davis, Gianluca Geraci, Mohammad Motamed
We introduce a new training algorithm for deep neural networks that utilize random complex exponential activation functions. Our approach employs a Markov Chain Monte Carlo sampling procedure to iteratively train network layers, avoiding global and gradient-based optimization while maintaining error control. It consistently attains the theoretical approximation rate for residual networks with complex exponential activation functions, determined by network complexity. Additionally, it enables efficient learning of multiscale and high-frequency features, producing interpretable parameter distributions. Despite using sinusoidal basis functions, we do not observe Gibbs phenomena in approximating discontinuous target functions.
NA May 5
Fourier Residual Networks Achieve Spectral Accuracy for Discontinuous Functions
Owen Davis, Mohammad Motamed, Olof Runborg
We present a constructive approximation framework for analyzing the expressive power of Fourier residual networks in approximating a broad class of one-dimensional functions. Our study covers both piecewise continuous functions -- including those with jump discontinuities in the function and its derivatives -- and fully smooth functions. We show that Fourier residual networks achieve spectral convergence without requiring periodicity or continuity, thereby overcoming key limitations of classical linear Fourier approximation and nonlinear methods, without being restricted to Barron-type function spaces. Our approach builds on classical techniques from approximation theory, including fixed-point iteration and Hermite interpolation by trigonometric polynomials. We support our theoretical results with numerical experiments based on both the constructed approximations and a randomized algorithm developed in our earlier work.
ML May 10, 2024
Approximation Error and Complexity Bounds for ReLU Networks on Low-Regular Function Spaces
Owen Davis, Gianluca Geraci, Mohammad Motamed
In this work, we consider the approximation of a large class of bounded functions, with minimal regularity assumptions, by ReLU neural networks. We show that the approximation error can be bounded from above by a quantity proportional to the uniform norm of the target function and inversely proportional to the product of network width and depth. We inherit this approximation error bound from Fourier features residual networks, a type of neural network that uses complex exponential activation functions. Our proof is constructive and proceeds by conducting a careful complexity analysis associated with the approximation of a Fourier features residual network by a ReLU network.
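Schematically, the stated bound has the following shape. Here $f$ is the target function, $f_{W,L}$ denotes a ReLU network of width $W$ and depth $L$, and $C$ is a constant; the notation, the norm on the left-hand side, and what $C$ depends on are assumptions, since the abstract does not specify them.

$$
\| f - f_{W,L} \| \;\le\; C \, \frac{\| f \|_{L^\infty}}{W \, L}
$$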
NA Oct 13, 2015
A Fuzzy-Stochastic Multiscale Model for Fiber Composites: A one-dimensional study
Ivo Babuska, Mohammad Motamed
We study mathematical and computational models for computing the deformation of fiber-reinforced cross-plied laminates due to external forces. This requires an understanding of both micro-structural effects and different sources of uncertainty in the problem. We first show that the uncertainties in the problem are of both statistical (aleatoric) and systematic (epistemic) types and that current multiscale stochastic models, such as stationary random fields, which are based on precise probability theory, are not capable of correctly characterizing uncertainty in fiber composites. Next, we motivate the applicability of models based on imprecise uncertainty theory and present a novel fuzzy-stochastic model, which can more accurately describe uncertainties in fiber composites. The new model is constructed by combining stochastic fields and fuzzy variables through a simple calibration-validation approach. Finally, we construct a global-local multiscale algorithm for efficiently computing output quantities of interest. The method aims at approximating required quantities, such as displacements and stresses, in regions of relatively small size, e.g. hot spots or zones. The algorithm uses the concept of representative volume elements and computes a global solution to construct a local approximation that captures the microscale features of the solution. The results are based on and backed by real experimental data.
NA Sep 10, 2015
A Sparse Stochastic Collocation Technique for High-Frequency Wave Propagation with Uncertainty
Gabriela Malenova, Mohammad Motamed, Olof Runborg et al.
We consider the wave equation with highly oscillatory initial data, where there is uncertainty in the wave speed, initial phase and/or initial amplitude. To estimate quantities of interest related to the solution and their statistics, we combine a high-frequency method based on Gaussian beams with sparse stochastic collocation. Although the wave solution, $u^\varepsilon$, is highly oscillatory in both physical and stochastic spaces, we provide theoretical arguments and numerical evidence that quantities of interest based on local averages of $|u^\varepsilon|^2$ are smooth, with derivatives in the stochastic space uniformly bounded in $\varepsilon$, where $\varepsilon$ denotes the short wavelength. This observable-related regularity makes the sparse stochastic collocation approach more efficient than Monte Carlo methods. We present numerical tests that demonstrate this advantage.
NA Feb 4, 2010
Taylor Expansion and Discretization Errors in Gaussian Beam Superposition
Mohammad Motamed, Olof Runborg
The Gaussian beam superposition method is an asymptotic method for computing high-frequency wave fields in smoothly varying inhomogeneous media. In this paper we study the accuracy of the Gaussian beam superposition method and derive error estimates related to the discretization of the superposition integral and the Taylor expansion of the phase and amplitude off the center of the beam. We show that in the case of odd-order beams, the error is smaller than a simple analysis would indicate because of error cancellation effects between the beams. Since the cancellation happens only when odd-order beams are used, there is no remarkable gain in using even-order beams. Moreover, applying the error estimate to the problem with constant speed of propagation, we show that in this case the local beam width is not a good indicator of accuracy, and there is no direct relation between the error and the beam width. We present numerical examples to verify the error estimates.