NA · May 20, 2018
Data-driven polynomial ridge approximation using variable projection
Jeffrey M. Hokanson, Paul G. Constantine
Inexpensive surrogates are useful for reducing the cost of science and engineering studies involving large-scale, complex computational models with many input parameters. A ridge approximation is one class of surrogate that models a quantity of interest as a nonlinear function of a few linear combinations of the input parameters. When used in parameter studies (e.g., optimization or uncertainty quantification), ridge approximations allow the low-dimensional structure to be exploited, reducing the effective dimension. We introduce a new, fast algorithm for constructing a ridge approximation where the nonlinear function is a polynomial. This polynomial ridge approximation is chosen to minimize the least-squares mismatch between the surrogate and the quantity of interest on a given set of inputs. Naively, this would require optimizing both the polynomial coefficients and the weights of the linear combinations, the latter of which define a low-dimensional subspace of the input space. However, given a fixed subspace, the optimal polynomial can be found by solving a linear least-squares problem; hence, by using variable projection, the polynomial can be found implicitly, leaving an optimization problem over the subspace alone. We provide an algorithm that finds this polynomial ridge approximation by minimizing over the Grassmann manifold of low-dimensional subspaces using a Gauss-Newton method. We provide details of this optimization algorithm and demonstrate its performance on several numerical examples. Our Gauss-Newton method has superior theoretical guarantees and faster convergence than the approach for polynomial ridge approximation proposed earlier by Constantine, Eftekhari, Hokanson, and Ward [https://doi.org/10.1016/j.cma.2017.07.038], which alternates between (i) optimizing the polynomial coefficients given the subspace and (ii) optimizing the subspace given the coefficients.
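The variable projection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional subspace, a monomial basis, and synthetic data, and the function name `ridge_residual` is hypothetical. For a fixed direction, the polynomial coefficients come from a linear least-squares solve, so the residual depends on the direction alone. The Gauss-Newton iteration over the Grassmann manifold is not shown.

```python
import numpy as np

def ridge_residual(u, X, f, degree):
    """Residual of the best degree-`degree` polynomial in the coordinate y = X @ u.

    Hypothetical helper: the coefficients c are eliminated by a linear
    least-squares solve, so the residual is a function of u only.
    """
    u = u / np.linalg.norm(u)                         # only the direction matters
    y = X @ u                                         # projected inputs
    V = np.polynomial.polynomial.polyvander(y, degree)  # monomial basis in y
    c, *_ = np.linalg.lstsq(V, f, rcond=None)         # optimal coefficients for this u
    return f - V @ c, c

# Synthetic data (an assumption): an exact cubic ridge function in 5 inputs.
rng = np.random.default_rng(0)
m, N = 5, 200
X = rng.uniform(-1, 1, size=(N, m))
u_true = np.ones(m) / np.sqrt(m)
f = (X @ u_true) ** 3 + 0.5 * (X @ u_true)

r_true, _ = ridge_residual(u_true, X, f, degree=3)
r_rand, _ = ridge_residual(rng.standard_normal(m), X, f, degree=3)
print(np.linalg.norm(r_true), np.linalg.norm(r_rand))
```

In this sketch the residual norm is near machine precision at the true direction and much larger at a random one, which is the quantity an optimizer over subspaces would drive down.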
NA · Nov 30, 2018
Least Squares Rational Approximation
Jeffrey M. Hokanson, Caleb C. Magruder
Rational approximation appears in many contexts throughout science and engineering, playing a central role in linear systems theory, special function approximation, and many other areas. There are many existing methods for solving the rational approximation problem, from fixed point methods like the Sanathanan-Koerner iteration and Vector Fitting, to partial interpolation methods like Adaptive Antoulas-Anderson (AAA). While these methods can often find rational approximations with a small residual norm, they are unable to find optimizers with respect to a weighted l2 norm with a square dense weighting matrix. Here we develop a nonlinear least squares approach to constructing rational approximations with respect to this norm. We explore this approach using two parameterizations of rational functions: a ratio of two polynomials and a partial fraction expansion. In both cases, we show how we can use Variable Projection (VARPRO) to reduce the dimension of the optimization problem. As many applications seek a real rational approximation that can be described as a ratio of two real polynomials, we show how this constraint can be enforced in both parameterizations. Although this nonlinear least squares approach often converges to suboptimal local minimizers, we find this can be largely mitigated by initializing the algorithm using the poles of the AAA algorithm applied to the same data. This combination of initialization and nonlinear least squares enables us to construct rational approximants using dense and potentially ill-conditioned weight matrices, such as those that appear as a step in a new H2 model reduction algorithm recently developed by the authors.
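As a rough illustration of how variable projection applies to the partial fraction parameterization, the sketch below fixes the poles and solves a weighted linear least-squares problem for the residues. The function name `projected_residual`, the test poles, and the dense weight matrix `W` are all assumptions made for the example; the outer nonlinear iteration over the poles and the AAA initialization are not shown.

```python
import numpy as np

def projected_residual(poles, z, f, W):
    """Weighted residual after eliminating the residues for fixed poles.

    Hypothetical helper: r(z) = sum_k a_k / (z - poles[k]) is linear in the
    residues a, so they follow from a weighted linear least-squares solve.
    """
    C = 1.0 / (z[:, None] - poles[None, :])            # Cauchy matrix
    a, *_ = np.linalg.lstsq(W @ C, W @ f, rcond=None)  # optimal residues
    return W @ (f - C @ a), a

# Synthetic data (an assumption): samples of a degree-3 rational function.
z = 1j * np.linspace(-5, 5, 101)
true_poles = np.array([-1 + 2j, -1 - 2j, -0.5 + 0j])
true_res = np.array([1 - 1j, 1 + 1j, 2 + 0j])
f = (1.0 / (z[:, None] - true_poles[None, :])) @ true_res

# A dense, square weight matrix chosen only for illustration.
rng = np.random.default_rng(1)
W = np.eye(z.size) + 0.01 * rng.standard_normal((z.size, z.size))

r_true, a_true = projected_residual(true_poles, z, f, W)
r_pert, _ = projected_residual(true_poles + 0.3, z, f, W)
print(np.linalg.norm(r_true), np.linalg.norm(r_pert))
```

With the residues eliminated, the optimization variables are the poles alone, which is the dimension reduction the abstract attributes to VARPRO.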
LG · Oct 25, 2019
A Numerical Investigation of the Minimum Width of a Neural Network
Ibrohim Nosirov, Jeffrey M. Hokanson
Neural network width and depth are fundamental aspects of network topology. Universal approximation theorems guarantee that, with increasing width or depth, there exists a neural network that approximates a given function arbitrarily well. These theorems rely on idealized assumptions, such as infinite data, that must be discretized in practice. Through numerical experiments, we seek to test the lower bounds on network width established by Hanin in 2017.
NA · Jun 6, 2017
Projected nonlinear least squares for exponential fitting
Jeffrey M. Hokanson
The modern ability to collect vast quantities of data poses a challenge for parameter estimation problems. When posed as a nonlinear least squares problem fitting a model to data, the cost of each iteration grows linearly with the amount of data, and it can easily become prohibitively expensive to perform many iterations. Here we develop an approach that projects the high-dimensional data onto a low-dimensional subspace that preserves the information in the original data. We provide results from both optimization and statistical perspectives showing that the information is preserved when the subspace angles between this projection and the Jacobian of the model at the current iterate remain small. However, for this approach to reduce computational complexity, both the projected model and Jacobian must be computed inexpensively. This is a constraint on the pairs of models and subspaces for which this approach provides a computational speedup. Here we consider the exponential fitting problem projected onto the range of a Vandermonde matrix, for which the projected model and Jacobian can be computed in closed form using a generalized geometric sum formula. We further provide an inexpensive heuristic that picks this Vandermonde matrix so that the subspace angles with the Jacobian remain small, and we use this heuristic to update the subspace during optimization. Although the asymptotic cost still depends on the data dimension, the overall cost of this sequence of projected nonlinear least squares problems is lower than that of the original nonlinear least squares problem. Applied to the exponential fitting problem, this provides an algorithm that is not only faster in the limit of large data than the conventional nonlinear least squares approach, but is also faster than subspace-based approaches such as HSVD.
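The reason a Vandermonde subspace makes the projected model cheap can be seen in the simplest case, sketched below. This uses only the ordinary geometric sum, not the generalized formula the abstract refers to, and the function names and node values are assumptions for illustration. The inner product of a Vandermonde column with nodes `mu` and an exponential model column with nodes `omega` has a closed form whose cost does not grow with the data length `n`.

```python
import numpy as np

def vandermonde(nodes, n):
    """n-by-len(nodes) matrix with entries nodes[j] ** k for k = 0..n-1."""
    return nodes[None, :] ** np.arange(n)[:, None]

def inner_products_closed_form(mu, omega, n):
    """Entries of vandermonde(mu, n)^H @ vandermonde(omega, n) via a geometric sum.

    sum_{k=0}^{n-1} (conj(mu_i) * omega_j)^k = (1 - q^n) / (1 - q),
    with q = conj(mu_i) * omega_j, assuming q != 1.
    """
    q = np.conj(mu)[:, None] * omega[None, :]
    return (1 - q**n) / (1 - q)

n = 1000
mu = np.exp(2j * np.pi * np.array([0.0, 0.1, 0.2]))      # subspace nodes (illustrative)
omega = np.exp(np.array([-0.001 + 0.5j, -0.002 + 1.3j]))  # model exponentials (illustrative)

direct = vandermonde(mu, n).conj().T @ vandermonde(omega, n)  # O(n) work per entry
closed = inner_products_closed_form(mu, omega, n)             # O(1) work per entry
print(np.max(np.abs(direct - closed)))
```

The direct product touches every data point, while the closed form does not, which is why the projection cost can stay small as the amount of data grows.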