NADec 18, 2016
Modified Virtual Grid Difference for Discretizing the Laplace-Beltrami Operator on Point CloudsMeng Wang, Shingyu Leung, Hongkai Zhao
We propose a new and simple discretization, named the Modified Virtual Grid Difference (MVGD), for numerical approximation of the Laplace-Beltrami (LB) operator on manifolds sampled by point clouds. The key observation is that both the manifold and a function defined on it can both be parametrized in a local Cartesian coordinate system and approximated using least squares. Based on the above observation, we first introduce a local virtual grid with a scale adapted to the sampling density centered at each point. Then we propose a modified finite difference scheme on the virtual grid to discretize the LB operator. Instead of using the local least squares values on all virtual grid points like the typical finite difference method, we use the function value explicitly at the grid located at the center (coincided with the data point). The new discretization provides more diagonal dominance to the resulting linear system and improves its conditioning. We show that the linear system can be robustly, efficiently and accurately solved by existing fast solver such as the Algebraic Multigrid (AMG) method. We will present numerical tests and comparison with other exiting methods to demonstrate the effectiveness and the performance of the proposed approach.
NAJan 19, 2011
A New Numerical Algorithm for Thermoacoustic and Photoacoustic Tomography with Variable Sound SpeedJianliang Qian, Plamen Stefanov, Gunther Uhlmann et al.
We present a new algorithm for reconstructing an unknown source in Thermoacoustic and Photoacoustic Tomography based on the recent advances in understanding the theoretical nature of the problem. We work with variable sound speeds that might be also discontinuous across some surface. The latter problem arises in brain imaging. The new algorithm is based on an explicit formula in the form of a Neumann series. We present numerical examples with non-trapping, trapping and piecewise smooth speeds, as well as examples with data on a part of the boundary. These numerical examples demonstrate the robust performance of the new algorithm.
NAJul 7, 2016
Fast alternating bi-directional preconditioner for the 2D high-frequency Lippmann-Schwinger equationLeonardo Zepeda-Núñez, Hongkai Zhao
This paper presents a fast iterative solver for Lippmann-Schwinger equation for high-frequency waves scattered by a smooth medium with a compactly supported inhomogeneity. The solver is based on the sparsifying preconditioner and a domain decomposition approach similar to the method of polarized traces. The iterative solver has two levels, the outer level in which a sparsifying preconditioner for the Lippmann-Schwinger equation is constructed, and the inner level, in which the resulting sparsified system is solved fast using an iterative solver preconditioned with a bi-directional matrix-free variant of the method of polarized traces. The complexity of the construction and application of the preconditioner is $\mathcal{O}(N)$ and $\mathcal{O}(N\log{N})$ respectively, where $N$ is the number of degrees of freedom. Numerical experiments in 2D indicate that the number of iterations in both levels depends weakly on the frequency resulting in method with an overall $\mathcal{O}(N\log{N})$ complexity.
LGJul 13, 2023
Deep Network Approximation: Beyond ReLU to Diverse Activation FunctionsShijun Zhang, Jianfeng Lu, Hongkai Zhao
This paper explores the expressive power of deep neural networks for a diverse range of activation functions. An activation function set $\mathscr{A}$ is defined to encompass the majority of commonly used activation functions, such as $\mathtt{ReLU}$, $\mathtt{LeakyReLU}$, $\mathtt{ReLU}^2$, $\mathtt{ELU}$, $\mathtt{CELU}$, $\mathtt{SELU}$, $\mathtt{Softplus}$, $\mathtt{GELU}$, $\mathtt{SiLU}$, $\mathtt{Swish}$, $\mathtt{Mish}$, $\mathtt{Sigmoid}$, $\mathtt{Tanh}$, $\mathtt{Arctan}$, $\mathtt{Softsign}$, $\mathtt{dSiLU}$, and $\mathtt{SRS}$. We demonstrate that for any activation function $\varrho\in \mathscr{A}$, a $\mathtt{ReLU}$ network of width $N$ and depth $L$ can be approximated to arbitrary precision by a $\varrho$-activated network of width $3N$ and depth $2L$ on any bounded set. This finding enables the extension of most approximation results achieved with $\mathtt{ReLU}$ networks to a wide variety of other activation functions, albeit with slightly increased constants. Significantly, we establish that the (width,$\,$depth) scaling factors can be further reduced from $(3,2)$ to $(1,1)$ if $\varrho$ falls within a specific subset of $\mathscr{A}$. This subset includes activation functions such as $\mathtt{ELU}$, $\mathtt{CELU}$, $\mathtt{SELU}$, $\mathtt{Softplus}$, $\mathtt{GELU}$, $\mathtt{SiLU}$, $\mathtt{Swish}$, and $\mathtt{Mish}$.
COApr 17, 2017
Precomputing Strategy for Hamiltonian Monte Carlo Method Based on Regularity in Parameter SpaceCheng Zhang, Babak Shahbaba, Hongkai Zhao
Markov Chain Monte Carlo (MCMC) algorithms play an important role in statistical inference problems dealing with intractable probability distributions. Recently, many MCMC algorithms such as Hamiltonian Monte Carlo (HMC) and Riemannian Manifold HMC have been proposed to provide distant proposals with high acceptance rate. These algorithms, however, tend to be computationally intensive which could limit their usefulness, especially for big data problems due to repetitive evaluations of functions and statistical quantities that depend on the data. This issue occurs in many statistic computing problems. In this paper, we propose a novel strategy that exploits smoothness (regularity) of parameter space to improve computational efficiency of MCMC algorithms. When evaluation of functions or statistical quantities are needed at a point in parameter space, interpolation from precomputed values or previous computed values is used. More specifically, we focus on Hamiltonian Monte Carlo (HMC) algorithms that use geometric information for faster exploration of probability distributions. Our proposed method is based on precomputing the required geometric information on a set of grids before running sampling information at nearby grids at each iteration of HMC. Sparse grid interpolation method is used for high dimensional problems. Tests on computational examples are shown to illustrate the advantages of our method.
NANov 16, 2017
A hybrid approach to solve the high-frequency Helmholtz equation with source singularity in smooth heterogeneous mediaJun Fang, Jianliang Qian, Leonardo Zepeda-Núñez et al.
We propose a hybrid approach to solve the high-frequency Helmholtz equation with point source terms in smooth heterogeneous media. The method is based on the ray-based finite element method (ray-FEM), whose original version can not handle the singularity close to point sources accurately. This pitfall is addressed by combining the ray-FEM, which is used to compute the smooth far-field of the solution accurately, with a high-order asymptotic expansion close to the point source, which is used to properly capture the singularity of the solution in the near-field. The method requires a fixed number of grid points per wavelength to accurately represent the wave field with an asymptotic convergence rate of $\mathcal{O}(ω^{-1/2})$, where $ω$ is the frequency parameter in the Helmholtz equation. In addition, a fast sweeping-type preconditioner is used to solve the resulting linear system. We present numerical examples in 2D to show both accuracy and efficiency of our method as the frequency increases. In particular, we provide numerical evidence of the convergence rate, and we show empirically that the overall complexity is $\mathcal{O}(ω^2)$ up to a poly-logarithmic factor.
NAMar 7, 2018
A hybrid adaptive phase space method for reflection traveltime tomographyHongkai Zhao, Yimin Zhong
We present a hybrid imaging method for a challenging travel time tomography problem which includes both unknown medium and unknown scatterers in a bounded domain. The goal is to recover both the medium and the boundary of the scatterers from the scattering relation data on the domain boundary. Our method is composed of three steps: 1) preprocess the data to classify them into three different categories of measurements corresponding to non-broken rays, broken-once rays, and others, respectively, 2) use the the non-broken ray data and an effective data-driven layer stripping strategy--an optimization based iterative imaging method--to recover the medium velocity outside the convex hull of the scatterers, and 3) use selected broken-once ray data to recover the boundary of the scatterers--a direct imaging method. By numerical tests, we show that our hybrid method can recover both the unknown medium and the not-too-concave scatterers efficiently and robustly.
LGJun 29, 2023
Why Shallow Networks Struggle to Approximate and Learn High FrequenciesShijun Zhang, Hongkai Zhao, Yimin Zhong et al.
In this work, we present a comprehensive study combining mathematical and computational analysis to explain why a two-layer neural network struggles to handle high frequencies in both approximation and learning, especially when machine precision, numerical noise, and computational cost are significant factors in practice. Specifically, we investigate the following fundamental computational issues: (1) the minimal numerical error achievable under finite precision, (2) the computational cost required to attain a given accuracy, and (3) the stability of the method with respect to perturbations. The core of our analysis lies in the conditioning of the representation and its learning dynamics. Explicit answers to these questions are provided, along with supporting numerical evidence.
LGJan 29, 2023
On Enhancing Expressive Power via Compositions of Single Fixed-Size ReLU NetworkShijun Zhang, Jianfeng Lu, Hongkai Zhao
This paper explores the expressive power of deep neural networks through the framework of function compositions. We demonstrate that the repeated compositions of a single fixed-size ReLU network exhibit surprising expressive power, despite the limited expressive capabilities of the individual network itself. Specifically, we prove by construction that $\mathcal{L}_2\circ \boldsymbol{g}^{\circ r}\circ \boldsymbol{\mathcal{L}}_1$ can approximate $1$-Lipschitz continuous functions on $[0,1]^d$ with an error $\mathcal{O}(r^{-1/d})$, where $\boldsymbol{g}$ is realized by a fixed-size ReLU network, $\boldsymbol{\mathcal{L}}_1$ and $\mathcal{L}_2$ are two affine linear maps matching the dimensions, and $\boldsymbol{g}^{\circ r}$ denotes the $r$-times composition of $\boldsymbol{g}$. Furthermore, we extend such a result to generic continuous functions on $[0,1]^d$ with the approximation error characterized by the modulus of continuity. Our results reveal that a continuous-depth network generated via a dynamical system has immense approximation power even if its dynamics function is time-independent and realized by a fixed-size ReLU network.
LGFeb 26, 2025
Fourier Multi-Component and Multi-Layer Neural Networks: Unlocking High-Frequency PotentialShijun Zhang, Hongkai Zhao, Yimin Zhong et al.
The architecture of a neural network and the selection of its activation function are both fundamental to its performance. Equally vital is ensuring these two elements are well-matched, as their alignment is key to achieving effective representation and learning. In this paper, we introduce the Fourier Multi-Component and Multi-Layer Neural Network (FMMNN), a novel model that creates a strong synergy between them. We demonstrate that FMMNNs are highly effective and flexible in modeling high-frequency components. Our theoretical results demonstrate that FMMNNs have exponential expressive power for function approximation. We also analyze the optimization landscape of FMMNNs and find it to be much more favorable than that of standard fully connected neural networks, especially when dealing with high-frequency features. In addition, we propose a scaled random initialization method for the first layer's weights in FMMNNs, which significantly speeds up training and enhances overall performance. Extensive numerical experiments support our theoretical insights, showing that FMMNNs consistently outperform traditional approaches in accuracy and efficiency across various tasks.
LGJun 30, 2024
Structured and Balanced Multi-Component and Multi-Layer Neural NetworksShijun Zhang, Hongkai Zhao, Yimin Zhong et al.
In this work, we propose a balanced multi-component and multi-layer neural network (MMNN) structure to accurately and efficiently approximate functions with complex features, in terms of both degrees of freedom and computational cost. The main idea is inspired by a multi-component approach, in which each component can be effectively approximated by a single-layer network, combined with a multi-layer decomposition strategy to capture the complexity of the target function. Although MMNNs can be viewed as a simple modification of fully connected neural networks (FCNNs) or multi-layer perceptrons (MLPs) by introducing balanced multi-component structures, they achieve a significant reduction in training parameters, a much more efficient training process, and improved accuracy compared to FCNNs or MLPs. Extensive numerical experiments demonstrate the effectiveness of MMNNs in approximating highly oscillatory functions and their ability to automatically adapt to localized features.
CVJul 26, 2020
A Dual Iterative Refinement Method for Non-rigid Shape MatchingRui Xiang, Rongjie Lai, Hongkai Zhao
In this work, a simple and efficient dual iterative refinement (DIR) method is proposed for dense correspondence between two nearly isometric shapes. The key idea is to use dual information, such as spatial and spectral, or local and global features, in a complementary and effective way, and extract more accurate information from current iteration to use for the next iteration. In each DIR iteration, starting from current correspondence, a zoom-in process at each point is used to select well matched anchor pairs by a local mapping distortion criterion. These selected anchor pairs are then used to align spectral features (or other appropriate global features) whose dimension adaptively matches the capacity of the selected anchor pairs. Thanks to the effective combination of complementary information in a data-adaptive way, DIR is not only efficient but also robust to render accurate results within a few iterations. By choosing appropriate dual features, DIR has the flexibility to handle patch and partial matching as well. Extensive experiments on various data sets demonstrate the superiority of DIR over other state-of-the-art methods in terms of both accuracy and efficiency.
CVMar 19, 2020
Efficient and Robust Shape Correspondence via Sparsity-Enforced Quadratic AssignmentRui Xiang, Rongjie Lai, Hongkai Zhao
In this work, we introduce a novel local pairwise descriptor and then develop a simple, effective iterative method to solve the resulting quadratic assignment through sparsity control for shape correspondence between two approximate isometric surfaces. Our pairwise descriptor is based on the stiffness and mass matrix of finite element approximation of the Laplace-Beltrami differential operator, which is local in space, sparse to represent, and extremely easy to compute while containing global information. It allows us to deal with open surfaces, partial matching, and topological perturbations robustly. To solve the resulting quadratic assignment problem efficiently, the two key ideas of our iterative algorithm are: 1) select pairs with good (approximate) correspondence as anchor points, 2) solve a regularized quadratic assignment problem only in the neighborhood of selected anchor points through sparsity control. These two ingredients can improve and increase the number of anchor points quickly while reducing the computation cost in each quadratic assignment iteration significantly. With enough high-quality anchor points, one may use various pointwise global features with reference to these anchor points to further improve the dense shape correspondence. We use various experiments to show the efficiency, quality, and versatility of our method on large data sets, patches, and point clouds (without global meshes).
NAJul 1, 2019
A data-driven approach for multiscale elliptic PDEs with random coefficients based on intrinsic dimension reductionSijing Li, Zhiwen Zhang, Hongkai Zhao
We propose a data-driven approach to solve multiscale elliptic PDEs with random coefficients based on the intrinsic low dimension structure of the underlying elliptic differential operators. Our method consists of offline and online stages. At the offline stage, a low dimension space and its basis are extracted from the data to achieve significant dimension reduction in the solution space. At the online stage, the extracted basis will be used to solve a new multiscale elliptic PDE efficiently. The existence of low dimension structure is established by showing the high separability of the underlying Green's functions. Different online construction methods are proposed depending on the problem setup. We provide error analysis based on the sampling error and the truncation threshold in building the data-driven basis. Finally, we present numerical examples to demonstrate the accuracy and efficiency of the proposed method.
NAAug 31, 2016
Learning Dominant Wave Directions For Plane Wave Methods For High-Frequency Helmholtz EquationsJun Fang, Jianliang Qian, Leonardo Zepeda-Núñez et al.
We present a ray-based finite element method (ray-FEM) by learning basis adaptive to the underlying high-frequency Helmholtz equation in smooth media. Based on the geometric optics ansatz of the wave field, we learn local dominant ray directions by probing the medium using low-frequency waves with the same source. Once local ray directions are extracted, they are incorporated into the finite element basis to solve the high-frequency Helmholtz equation. This process can be continued to further improve approximations for both local ray directions and the high frequency wave field iteratively. The method requires a fixed number of grid points per wavelength to represent the wave field and achieves an asymptotic convergence as the frequency $ω\rightarrow \infty$ without the pollution effect. A fast solver is developed for the resulting linear system with an empirical complexity $\mathcal{O}(ω^d)$ up to a poly-logarithmic factor. Numerical examples in 2D are presented to corroborate the claims.
COFeb 6, 2016
Variational Hamiltonian Monte Carlo via Score MatchingCheng Zhang, Babak Shahbaba, Hongkai Zhao
Traditionally, the field of computational Bayesian statistics has been divided into two main subfields: variational methods and Markov chain Monte Carlo (MCMC). In recent years, however, several methods have been proposed based on combining variational Bayesian inference and MCMC simulation in order to improve their overall accuracy and computational efficiency. This marriage of fast evaluation and flexible approximation provides a promising means of designing scalable Bayesian inference methods. In this paper, we explore the possibility of incorporating variational approximation into a state-of-the-art MCMC method, Hamiltonian Monte Carlo (HMC), to reduce the required gradient computation in the simulation of Hamiltonian flow, which is the bottleneck for many applications of HMC in big data problems. To this end, we use a {\it free-form} approximation induced by a fast and flexible surrogate function based on single-hidden layer feedforward neural networks. The surrogate provides sufficiently accurate approximation while allowing for fast exploration of parameter space, resulting in an efficient approximate inference algorithm. We demonstrate the advantages of our method on both synthetic and real data problems.
COJun 18, 2015
Hamiltonian Monte Carlo Acceleration Using Surrogate Functions with Random BasesCheng Zhang, Babak Shahbaba, Hongkai Zhao
For big data analysis, high computational cost for Bayesian methods often limits their applications in practice. In recent years, there have been many attempts to improve computational efficiency of Bayesian inference. Here we propose an efficient and scalable computational technique for a state-of-the-art Markov Chain Monte Carlo (MCMC) methods, namely, Hamiltonian Monte Carlo (HMC). The key idea is to explore and exploit the structure and regularity in parameter space for the underlying probabilistic model to construct an effective approximation of its geometric properties. To this end, we build a surrogate function to approximate the target distribution using properly chosen random bases and an efficient optimization process. The resulting method provides a flexible, scalable, and efficient sampling algorithm, which converges to the correct target distribution. We show that by choosing the basis functions and optimization process differently, our method can be related to other approaches for the construction of surrogate functions such as generalized additive models or Gaussian process models. Experiments based on simulated and real data show that our approach leads to substantially more efficient sampling algorithms compared to existing state-of-the art methods.