NAFeb 18, 2016
Polynomial approximation via compressed sensing of high-dimensional functions on lower setsAbdellah Chkifa, Nick Dexter, Hoang Tran et al.
This work proposes and analyzes a compressed sensing approach to polynomial approximation of complex-valued functions in high dimensions. Of particular interest is the setting where the target function is smooth, characterized by a rapidly decaying orthonormal expansion, whose most important terms are captured by a lower (or downward closed) set. By exploiting this fact, we present an innovative weighted $\ell_1$ minimization procedure with a precise choice of weights, and a new iterative hard thresholding method, for imposing the downward closed preference. Theoretical results reveal that our computational approaches possess a provably reduced sample complexity compared to existing compressed sensing techniques presented in the literature. In addition, the recovery of the corresponding best approximation using these methods is established through an improved bound for the restricted isometry property. Our analysis represents an extension of the approach for Hadamard matrices in [5] to the general case of continuous bounded orthonormal systems, quantifies the dependence of sample complexity on the successful recovery probability, and provides an estimate on the number of measurements with explicit constants. Numerical examples are provided to support the theoretical results and demonstrate the computational efficiency of the novel weighted $\ell_1$ minimization strategy.
LGNov 21, 2022
Examining Policy Entropy of Reinforcement Learning Agents for Personalization TasksAnton Dereventsov, Andrew Starnes, Clayton G. Webster
This effort is focused on examining the behavior of reinforcement learning systems in personalization environments and detailing the differences in policy entropy associated with the type of learning algorithm utilized. We demonstrate that Policy Optimization agents often possess low-entropy policies during training, which in practice results in agents prioritizing certain actions and avoiding others. Conversely, we also show that Q-Learning agents are far less susceptible to such behavior and generally maintain high-entropy policies throughout training, which is often preferable in real-world applications. We provide a wide range of numerical experiments as well as theoretical justification to show that these differences in entropy are due to the type of learning being employed.
OCJun 18, 2020
An adaptive stochastic gradient-free approach for high-dimensional blackbox optimizationAnton Dereventsov, Clayton G. Webster, Joseph D. Daws
In this work, we propose a novel adaptive stochastic gradient-free (ASGF) approach for solving high-dimensional nonconvex optimization problems based on function evaluations. We employ a directional Gaussian smoothing of the target function that generates a surrogate of the gradient and assists in avoiding bad local optima by utilizing nonlocal information of the loss landscape. Applying a deterministic quadrature scheme results in a massively scalable technique that is sample-efficient and achieves spectral accuracy. At each step we randomly generate the search directions while primarily following the surrogate of the smoothed gradient. This enables exploitation of the gradient direction while maintaining sufficient space exploration, and accelerates convergence towards the global extrema. In addition, we make use of a local approximation of the Lipschitz constant in order to adaptively adjust the values of all hyperparameters, thus removing the careful fine-tuning of current algorithms that is often necessary to be successful when applied to a large class of learning tasks. As such, the ASGF strategy offers significant improvements when solving high-dimensional nonconvex optimization problems when compared to other gradient-free methods (including the so called "evolutionary strategies") as well as iterative approaches that rely on the gradient information of the objective function. We illustrate the improved performance of this method by providing several comparative numerical studies on benchmark global optimization problems and reinforcement learning tasks.
NAApr 13, 2020
Analysis of The Ratio of $\ell_1$ and $\ell_2$ Norms in Compressed SensingYiming Xu, Akil Narayan, Hoang Tran et al.
We first propose a novel criterion that guarantees that an $s$-sparse signal is the local minimizer of the $\ell_1/\ell_2$ objective; our criterion is interpretable and useful in practice. We also give the first uniform recovery condition using a geometric characterization of the null space of the measurement matrix, and show that this condition is easily satisfied for a class of random matrices. We also present analysis on the robustness of the procedure when noise pollutes data. Numerical experiments are provided that compare $\ell_1/\ell_2$ with some other popular non-convex methods in compressed sensing. Finally, we propose a novel initialization approach to accelerate the numerical optimization procedure. We call this initialization approach \emph{support selection}, and we demonstrate that it empirically improves the performance of existing $\ell_1/\ell_2$ algorithms.
CVSep 20, 2019
A nonlocal feature-driven exemplar-based approach for image inpaintingViktor Reshniak, Jeremy Trageser, Clayton G. Webster
We present a nonlocal variational image completion technique which admits simultaneous inpainting of multiple structures and textures in a unified framework. The recovery of geometric structures is achieved by using general convolution operators as a measure of behavior within an image. These are combined with a nonlocal exemplar-based approach to exploit the self-similarity of an image in the selected feature domains and to ensure the inpainting of textures. We also introduce an anisotropic patch distance metric to allow for better control of the feature selection within an image and present a nonlocal energy functional based on this metric. Finally, we derive an optimization algorithm for the proposed variational model and examine its validity experimentally with various test images.
LGMay 24, 2019
A Polynomial-Based Approach for Architectural Design and Learning with Deep Neural NetworksJoseph Daws, Clayton G. Webster
In this effort we propose a novel approach for reconstructing multivariate functions from training data, by identifying both a suitable network architecture and an initialization using polynomial-based approximations. Training deep neural networks using gradient descent can be interpreted as moving the set of network parameters along the loss landscape in order to minimize the loss functional. The initialization of parameters is important for iterative training methods based on descent. Our procedure produces a network whose initial state is a polynomial representation of the training data. The major advantage of this technique is from this initialized state the network may be improved using standard training procedures. Since the network already approximates the data, training is more likely to produce a set of parameters associated with a desirable local minimum. We provide the details of the theory necessary for constructing such networks and also consider several numerical examples that reveal our approach ultimately produces networks which can be effectively trained from our initialized state to achieve an improved approximation for a large class of target functions.
NAJul 7, 2017
On the Lebesgue Constant of Weighted Leja Points for Lagrange Interpolation on Unbounded DomainsPeter Jantsch, Clayton G. Webster, Guannan Zhang
This work focuses on weighted Lagrange interpolation on an unbounded domain, and analyzes the Lebesgue constant for a sequence of weighted Leja points. The standard Leja points are a nested sequence of points defined on a compact subset of the real line, and can be extended to unbounded domains with the introduction of a weight function $w:\mathbb{R}\rightarrow [0,1]$. Due to a simple recursive formulation in one dimension, such abscissas provide a foundation for high-dimensional approximation methods such as sparse grid collocation, deterministic least squares, and compressed sensing. Just as in the unweighted case of interpolation on a compact domain, we use results from potential theory to prove that the Lebesgue constant for the Leja points grows subexponentially with the number of interpolation nodes.
NAJun 9, 2017
Compressed sensing approaches for polynomial approximation of high-dimensional functionsBen Adcock, Simone Brugiapaglia, Clayton G. Webster
In recent years, the use of sparse recovery techniques in the approximation of high-dimensional functions has garnered increasing interest. In this work we present a survey of recent progress in this emerging topic. Our main focus is on the computation of polynomial approximations of high-dimensional functions on $d$-dimensional hypercubes. We show that smooth, multivariate functions possess expansions in orthogonal polynomial bases that are not only approximately sparse, but possess a particular type of structured sparsity defined by so-called lower sets. This structure can be exploited via the use of weighted $\ell^1$ minimization techniques, and, as we demonstrate, doing so leads to sample complexity estimates that are at most logarithmically dependent on the dimension $d$. Hence the curse of dimensionality - the bane of high-dimensional approximation - is mitigated to a significant extent. We also discuss several practical issues, including unknown noise (due to truncation or numerical error), and highlight a number of open problems and challenges.
NAAug 6, 2015
A sparse grid method for Bayesian uncertainty quantification with application to large eddy simulation turbulence modelsHoang A. Tran, Clayton G. Webster, Guannan Zhang
There is wide agreement that the accuracy of turbulence models suffer from their sensitivity with respect to physical input data, the uncertainties of user-elected parameters, as well as the model inadequacy. However, the application of Bayesian inference to systematically quantify the uncertainties in parameters, by means of exploring posterior probability density functions (PPDFs), has been hindered by the prohibitively daunting computational cost associated with the large number of model executions, in addition to daunting computation time per one turbulence simulation. In this effort, we perform in this paper an adaptive hierarchical sparse grid surrogate modeling approach to Bayesian inference of large eddy simulation (LES). First, an adaptive hierarchical sparse grid surrogate for the output of forward models is constructed using a relatively small number of model executions. Using such surrogate, the likelihood function can be rapidly evaluated at any point in the parameter space without simulating the computationally expensive LES model. This method is essentially similar to those developed in [62] for geophysical and groundwater models, but is adjusted and applied here for a much more challenging problem of uncertainty quantification of turbulence models. Through a numerical demonstration of the Smagorinsky model of two-dimensional flow around a cylinder at sub-critical Reynolds number, our approach is proven to significantly reduce the number of costly LES executions without losing much accuracy in the posterior probability estimation. Here, the model parameters are calibrated against synthetic data related to the mean flow velocity and Reynolds stresses at different locations in the flow wake. The influence of the user-elected LES parameters on the quality of output data will be discussed.
NAAug 5, 2015
A Dynamically Adaptive Sparse Grid Method for Quasi-Optimal Interpolation of Multidimensional Analytic FunctionsMiroslav K. Stoyanov, Clayton G. Webster
In this work we develop a dynamically adaptive sparse grids (SG) method for quasi-optimal interpolation of multidimensional analytic functions defined over a product of one dimensional bounded domains. The goal of such approach is to construct an interpolant in space that corresponds to the "best $M$-terms" based on sharp a priori estimate of polynomial coefficients. In the past, SG methods have been successful in achieving this, with a traditional construction that relies on the solution to a Knapsack problem: only the most profitable hierarchical surpluses are added to the SG. However, this approach requires additional sharp estimates related to the size of the analytic region and the norm of the interpolation operator, i.e., the Lebesgue constant. Instead, we present an iterative SG procedure that adaptively refines an estimate of the region and accounts for the effects of the Lebesgue constant. Our approach does not require any a priori knowledge of the analyticity or operator norm, is easily generalized to both affine and non-affine analytic functions, and can be applied to sparse grids build from one dimensional rules with arbitrary growth of the number of nodes. In several numerical examples, we utilize our dynamically adaptive SG to interpolate quantities of interest related to the solutions of parametrized elliptic and hyperbolic PDEs, and compare the performance of our quasi-optimal interpolant to several alternative SG schemes.
NAMay 4, 2015
Accelerating stochastic collocation methods for partial differential equations with random input dataDiego Galindo, Peter Jantsch, Clayton G. Webster et al.
This work proposes and analyzes a generalized acceleration technique for decreasing the computational complexity of using stochastic collocation (SC) methods to solve partial differential equations (PDEs) with random input data. The SC approaches considered in this effort consist of sequentially constructed multi-dimensional Lagrange interpolant in the random parametric domain, formulated by collocating on a set of points so that the resulting approximation is defined in a hierarchical sequence of polynomial spaces of increasing fidelity. Our acceleration approach exploits the construction of the SC interpolant to accelerate the underlying ensemble of deterministic solutions. Specifically, we predict the solution of the parametrized PDE at each collocation point on the current level of the SC approximation by evaluating each sample with a previously assembled lower fidelity interpolant, and then use such predictions to provide deterministic (linear or nonlinear) iterative solvers with improved initial approximations. As a concrete example, we develop our approach in the context of SC approaches that employ sparse tensor products of globally defined Lagrange polynomials on nested one-dimensional Clenshaw-Curtis abscissas. This work also provides a rigorous computational complexity analysis of the resulting fully discrete sparse grid SC approximation, with and without acceleration, which demonstrates the effectiveness of our proposed methodology in reducing the total number of iterations of a conjugate gradient solution of the finite element systems at each collocation point. Numerical examples include both linear and nonlinear parametrized PDEs, which are used to illustrate the theoretical results and the improved efficiency of this technique compared with several others.