Dongbin Xiu

LG
h-index: 4
29 papers
915 citations
Novelty: 50%
AI Score: 39

29 Papers

NA · Jan 16, 2016
A stochastic Galerkin method for general system of quasilinear hyperbolic conservation laws with uncertainty

Kailiang Wu, Huazhong Tang, Dongbin Xiu

This paper is concerned with generalized polynomial chaos (gPC) approximation for a general system of quasilinear hyperbolic conservation laws with uncertainty. The one-dimensional (1D) hyperbolic system is first symmetrized with the aid of the left eigenvector matrix of the Jacobian matrix. The stochastic Galerkin method is then applied to derive the equations for the gPC expansion coefficients. The resulting deterministic gPC Galerkin system is proved to be symmetrically hyperbolic. This important property then allows one to use a variety of numerical schemes for spatial and temporal discretization. Here a higher-order and path-conservative finite volume WENO scheme is adopted in space, along with a third-order total variation diminishing Runge-Kutta method in time. The method is further extended to two-dimensional (2D) quasilinear hyperbolic systems with uncertainty, where the symmetric hyperbolicity of the one-dimensional system is carried over via the operator splitting technique. Several 1D and 2D numerical experiments are conducted to demonstrate the accuracy and effectiveness of the proposed gPC stochastic Galerkin method.

LG · Jul 20, 2023
Flow Map Learning for Unknown Dynamical Systems: Overview, Implementation, and Benchmarks

Victor Churchill, Dongbin Xiu

Flow map learning (FML), in conjunction with deep neural networks (DNNs), has shown promise for data-driven modeling of unknown dynamical systems. A remarkable feature of FML is that it is capable of producing accurate predictive models for partially observed systems, even when their exact mathematical models do not exist. In this paper, we present an overview of the FML framework, along with the important computational details for its successful implementation. We also present a set of well-defined benchmark problems for learning unknown dynamical systems. All the numerical details of these problems are presented, along with their FML results, to ensure that the problems are accessible for cross-examination and the results are reproducible.

LG · May 12, 2022
Deep Learning of Chaotic Systems from Partially-Observed Data

Victor Churchill, Dongbin Xiu

Recently, a general data driven numerical framework has been developed for learning and modeling of unknown dynamical systems using fully- or partially-observed data. The method utilizes deep neural networks (DNNs) to construct a model for the flow map of the unknown system. Once an accurate DNN approximation of the flow map is constructed, it can be recursively executed to serve as an effective predictive model of the unknown system. In this paper, we apply this framework to chaotic systems, in particular the well-known Lorenz 63 and 96 systems, and critically examine the predictive performance of the approach. A distinct feature of chaotic systems is that even the smallest perturbations will lead to large (albeit bounded) deviations in the solution trajectories. This makes long-term predictions of the method, or any data driven methods, questionable, as the local model accuracy will eventually degrade and lead to large pointwise errors. Here we employ several other qualitative and quantitative measures to determine whether the chaotic dynamics have been learned. These include phase plots, histograms, autocorrelation, correlation dimension, approximate entropy, and Lyapunov exponent. Using these measures, we demonstrate that the flow map based DNN learning method is capable of accurately modeling chaotic systems, even when only a subset of the state variables are available to the DNNs. For example, for the Lorenz 96 system with 40 state variables, when data of only 3 variables are available, the method is able to learn an effective DNN model for the 3 variables and produce accurately the chaotic behavior of the system.

LG · Mar 7, 2022
Robust Modeling of Unknown Dynamical Systems via Ensemble Averaged Learning

Victor Churchill, Steve Manns, Zhen Chen et al.

Recent work has focused on data-driven learning of the evolution of unknown systems via deep neural networks (DNNs), with the goal of conducting long time prediction of the evolution of the unknown system. Training a DNN with low generalization error is a particularly important task in this case as error is accumulated over time. Because of the inherent randomness in DNN training, chiefly in stochastic optimization, there is uncertainty in the resulting prediction, and therefore in the generalization error. Hence, the generalization error can be viewed as a random variable with some probability distribution. Well-trained DNNs, particularly those with many hyperparameters, typically result in probability distributions for generalization error with low bias but high variance. High variance causes variability and unpredictability in the results of a trained DNN. This paper presents a computational technique which decreases the variance of the generalization error, thereby improving the reliability of the DNN model to generalize consistently. In the proposed ensemble averaging method, multiple models are independently trained and model predictions are averaged at each time step. A mathematical foundation for the method is presented, including results regarding the distribution of the local truncation error. In addition, three time-dependent differential equation problems are considered as numerical examples, demonstrating the effectiveness of the method to decrease variance of DNN predictions generally.
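
The averaging step described in this abstract can be pictured with a toy sketch. It is not the authors' code: each "model" below is a hypothetical one-step map with a slightly wrong coefficient, and the names `A_TRUE`, `EPSILONS`, `rollout_single`, and `rollout_ensemble` are assumptions made for illustration. The perturbations are chosen symmetric, so averaging cancels them almost exactly; with real trained networks one would expect a reduction in variance rather than exact cancellation.

```python
# Toy illustration of ensemble-averaged prediction (not the authors' code).
# Each "model" is a hypothetical one-step map x -> (a + eps) * x that mimics
# a trained network whose learned coefficient is slightly off.

A_TRUE = 0.9                           # exact one-step factor (assumed)
EPSILONS = [-0.02, -0.01, 0.01, 0.02]  # hypothetical per-model training errors


def make_model(eps):
    """Return a one-step predictor with a perturbed coefficient."""
    return lambda x: (A_TRUE + eps) * x


def rollout_single(model, x0, n_steps):
    """Recursively apply one model to its own output."""
    x = x0
    for _ in range(n_steps):
        x = model(x)
    return x


def rollout_ensemble(models, x0, n_steps):
    """Average the predictions of all models at every time step."""
    x = x0
    for _ in range(n_steps):
        x = sum(m(x) for m in models) / len(models)
    return x


models = [make_model(e) for e in EPSILONS]
n_steps = 20
exact = A_TRUE ** n_steps
single_errors = [abs(rollout_single(m, 1.0, n_steps) - exact) for m in models]
ensemble_error = abs(rollout_ensemble(models, 1.0, n_steps) - exact)
print(min(single_errors), ensemble_error)
```

The point of the sketch is the placement of the average: it is taken inside the time loop, so every model receives the averaged state as its next input, rather than averaging complete trajectories at the end.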

NA · Nov 13, 2018
Energy Conserving Galerkin Approximation of Two Dimensional Wave Equations with Random Coefficients

Ching-Shan Chou, Yukun Li, Dongbin Xiu

Wave propagation problems for heterogeneous media are known to have many applications in physics and engineering. Recently, there has been an increasing interest in stochastic effects due to the uncertainty, which may arise from impurities of the media. This work considers a two-dimensional wave equation with random coefficients which may be discontinuous in space. Generalized polynomial chaos method is used in conjunction with stochastic Galerkin approximation, and local discontinuous Galerkin method is used for spatial discretization. Our method is shown to be energy preserving in semi-discrete form as well as in fully discrete form, when leap-frog time discretization is used. Its convergence rate is proved to be optimal and the error grows linearly in time. The theoretical properties of the proposed scheme are validated by numerical tests.

LG · Jun 3, 2022
Learning Fine Scale Dynamics from Coarse Observations via Inner Recurrence

Victor Churchill, Dongbin Xiu

Recent work has focused on data-driven learning of the evolution of unknown systems via deep neural networks (DNNs), with the goal of conducting long term prediction of the dynamics of the unknown system. In many real-world applications, data from time-dependent systems are often collected on a time scale that is coarser than desired, due to various restrictions during the data acquisition process. Consequently, the observed dynamics can be severely under-sampled and do not reflect the true dynamics of the underlying system. This paper presents a computational technique to learn the fine-scale dynamics from such coarsely observed data. The method employs inner recurrence of a DNN to recover the fine-scale evolution operator of the underlying system. In addition to mathematical justification, several challenging numerical examples, including unknown systems of both ordinary and partial differential equations, are presented to demonstrate the effectiveness of the proposed method.

LG · Apr 14, 2025 · Code
DUE: A Deep Learning Framework and Library for Modeling Unknown Equations

Junfeng Chen, Kailiang Wu, Dongbin Xiu

Equations, particularly differential equations, are fundamental for understanding natural phenomena and predicting complex dynamics across various scientific and engineering disciplines. However, the governing equations for many complex systems remain unknown due to intricate underlying mechanisms. Recent advancements in machine learning and data science offer a new paradigm for modeling unknown equations from measurement or simulation data. This paradigm shift, known as data-driven discovery or modeling, stands at the forefront of AI for science, with significant progress made in recent years. In this paper, we introduce a systematic framework for data-driven modeling of unknown equations using deep learning. This versatile framework is capable of learning unknown ODEs, PDEs, DAEs, IDEs, SDEs, reduced or partially observed systems, and non-autonomous differential equations. Based on this framework, we have developed Deep Unknown Equations (DUE), an open-source software package designed to facilitate the data-driven modeling of unknown equations using modern deep learning techniques. DUE serves as an educational tool for classroom instruction, enabling students and newcomers to gain hands-on experience with differential equations, data-driven modeling, and contemporary deep learning approaches such as FNN, ResNet, generalized ResNet, operator semigroup networks (OSG-Net), and Transformers. Additionally, DUE is a versatile and accessible toolkit for researchers across various scientific and engineering fields. It is applicable not only for learning unknown equations from data but also for surrogate modeling of known, yet complex, equations that are costly to solve using traditional numerical methods. We provide detailed descriptions of DUE and demonstrate its capabilities through diverse examples, which serve as templates that can be easily adapted for other applications.

LG · Aug 27, 2024
Data-driven Effective Modeling of Multiscale Stochastic Dynamical Systems

Yuan Chen, Dongbin Xiu

We present a numerical method for learning the dynamics of slow components of unknown multiscale stochastic dynamical systems. While the governing equations of the systems are unknown, bursts of observation data of the slow variables are available. By utilizing the observation data, our proposed method is capable of constructing a generative stochastic model that can accurately capture the effective dynamics of the slow variables in distribution. We present a comprehensive set of numerical examples to demonstrate the performance of the proposed method.

LG · Sep 27, 2024
Chebyshev Feature Neural Network for Accurate Function Approximation

Zhongshu Xu, Yuan Chen, Dongbin Xiu

We present a new Deep Neural Network (DNN) architecture capable of approximating functions up to machine accuracy. Termed Chebyshev Feature Neural Network (CFNN), the new structure employs Chebyshev functions with learnable frequencies as the first hidden layer, followed by the standard fully connected hidden layers. The learnable frequencies of the Chebyshev layer are initialized with exponential distributions to cover a wide range of frequencies. Combined with a multi-stage training strategy, we demonstrate that this CFNN structure can achieve machine accuracy during training. A comprehensive set of numerical examples for dimensions up to 20 is provided to demonstrate the effectiveness and scalability of the method.

LG · Dec 15, 2023
Modeling Unknown Stochastic Dynamical System via Autoencoder

Zhongshu Xu, Yuan Chen, Qifan Chen et al.

We present a numerical method to learn an accurate predictive model for an unknown stochastic dynamical system from its trajectory data. The method seeks to approximate the unknown flow map of the underlying system. It employs the idea of autoencoder to identify the unobserved latent random variables. In our approach, we design an encoding function to discover the latent variables, which are modeled as unit Gaussian, and a decoding function to reconstruct the future states of the system. Both the encoder and decoder are expressed as deep neural networks (DNNs). Once the DNNs are trained by the trajectory data, the decoder serves as a predictive model for the unknown stochastic system. Through an extensive set of numerical examples, we demonstrate that the method is able to produce long-term system predictions by using short bursts of trajectory data. It is also applicable to systems driven by non-Gaussian noises.

LG · Apr 2, 2025
Multi-fidelity Parameter Estimation Using Conditional Diffusion Models

Caroline Tatsuoka, Minglei Yang, Dongbin Xiu et al.

We present a multi-fidelity method for uncertainty quantification of parameter estimates in complex systems, leveraging generative models trained to sample the target conditional distribution. In the Bayesian inference setting, traditional parameter estimation methods rely on repeated simulations of potentially expensive forward models to determine the posterior distribution of the parameter values, which may result in computationally intractable workflows. Furthermore, methods such as Markov Chain Monte Carlo (MCMC) necessitate rerunning the entire algorithm for each new data observation, further increasing the computational burden. Hence, we propose a novel method for efficiently obtaining posterior distributions of parameter estimates for high-fidelity models given data observations of interest. The method first constructs a low-fidelity, conditional generative model capable of amortized Bayesian inference and hence rapid posterior density approximation over a wide range of data observations. When higher accuracy is needed for a specific data observation, the method employs adaptive refinement of the density approximation. It uses outputs from the low-fidelity generative model to refine the parameter sampling space, ensuring efficient use of the computationally expensive high-fidelity solver. Subsequently, a high-fidelity, unconditional generative model is trained to achieve greater accuracy in the target posterior distribution. Both low- and high-fidelity generative models enable efficient sampling from the target posterior and do not require repeated simulation of the high-fidelity forward model. We demonstrate the effectiveness of the proposed method on several numerical examples, including cases with multi-modal densities, as well as an application in plasma physics for a runaway electron simulation model.

LG · Oct 23, 2024
Deep learning for model correction of dynamical systems with data scarcity

Caroline Tatsuoka, Dongbin Xiu

We present a deep learning framework for correcting existing dynamical system models utilizing only a scarce high-fidelity data set. In many practical situations, one has a low-fidelity model that can capture the dynamics reasonably well but lacks high resolution, due to the inherent limitation of the model and the complexity of the underlying physics. When high resolution data become available, it is natural to seek model correction to improve the resolution of the model predictions. We focus on the case when the amount of high-fidelity data is so small that most of the existing data-driven modeling methods cannot be applied. In this paper, we address these challenges with a model-correction method which only requires a scarce high-fidelity data set. Our method first seeks a deep neural network (DNN) model to approximate the existing low-fidelity model. By using the scarce high-fidelity data, the method then corrects the DNN model via transfer learning (TL). After TL, an improved DNN model with high prediction accuracy to the underlying dynamics is obtained. One distinct feature of the proposed method is that it does not assume a specific form of the model correction terms. Instead, it offers an inherent correction to the low-fidelity model via TL. A set of numerical examples is presented to demonstrate the effectiveness of the proposed method.

LG · Oct 8, 2025
Targeted Digital Twin via Flow Map Learning and Its Application to Fluid Dynamics

Qifan Chen, Zhongshu Xu, Jinjin Zhang et al.

We present a numerical framework for constructing a targeted digital twin (tDT) that directly models the dynamics of quantities of interest (QoIs) in a full digital twin (DT). The proposed approach employs memory-based flow map learning (FML) to develop a data-driven model of the QoIs using short bursts of trajectory data generated through repeated executions of the full DT. This renders the construction of the FML-based tDT an entirely offline computational process. During online simulation, the learned tDT can efficiently predict and analyze the long-term dynamics of the QoIs without requiring simulations of the full DT system, thereby achieving substantial computational savings. After introducing the general numerical procedure, we demonstrate the construction and predictive capability of the tDT in a computational fluid dynamics (CFD) example: two-dimensional incompressible flow past a cylinder. The QoIs in this problem are the hydrodynamic forces exerted on the cylinder. The resulting tDTs are compact dynamical systems that evolve these forces without explicit knowledge of the underlying flow field. Numerical results show that the tDTs yield accurate long-term predictions of the forces while entirely bypassing full flow simulations.

LG · Jun 22, 2024
Modeling Unknown Stochastic Dynamical System Subject to External Excitation

Yuan Chen, Dongbin Xiu

We present a numerical method for learning unknown nonautonomous stochastic dynamical systems, i.e., stochastic systems subject to time-dependent excitation or control signals. Our basic assumption is that the governing equations for the stochastic system are unavailable. However, short bursts of input/output (I/O) data consisting of certain known excitation signals and their corresponding system responses are available. When a sufficient amount of such I/O data is available, our method is capable of learning the unknown dynamics and producing an accurate predictive model for the stochastic responses of the system subject to arbitrary excitation signals not in the training data. Our method has two key components: (1) a local approximation of the training I/O data to transfer the learning into a parameterized form; and (2) a generative model to approximate the underlying unknown stochastic flow map in distribution. After presenting the method in detail, we present a comprehensive set of numerical examples to demonstrate the performance of the proposed method, especially for long-term system predictions.

LG · May 5, 2023
Learning Stochastic Dynamical System via Flow Map Operator

Yuan Chen, Dongbin Xiu

We present a numerical framework for learning unknown stochastic dynamical systems using measurement data. Termed stochastic flow map learning (sFML), the new framework is an extension of flow map learning (FML) that was developed for learning deterministic dynamical systems. For learning stochastic systems, we define a stochastic flow map that is a superposition of two sub-flow maps: a deterministic sub-map and a stochastic sub-map. The stochastic training data are used to construct the deterministic sub-map first, followed by the stochastic sub-map. The deterministic sub-map takes the form of a residual network (ResNet), similar to the work of FML for deterministic systems. For the stochastic sub-map, we employ a generative model, particularly generative adversarial networks (GANs) in this paper. The final constructed stochastic flow map then defines a stochastic evolution model that is a weak approximation, in terms of distribution, of the unknown stochastic system. A comprehensive set of numerical examples is presented to demonstrate the flexibility and effectiveness of the proposed sFML method for various types of stochastic systems.

ML · Feb 3, 2022
Modeling unknown dynamical systems with hidden parameters

Xiaohan Fu, Weize Mao, Lo-Bin Chang et al.

We present a data-driven numerical approach for modeling unknown dynamical systems with missing/hidden parameters. The method is based on training a deep neural network (DNN) model for the unknown system using its trajectory data. A key feature is that the unknown dynamical system contains system parameters that are completely hidden, in the sense that no information about the parameters is available through either the measurement trajectory data or our prior knowledge of the system. We demonstrate that by training a DNN using the trajectory data with sufficient time history, the resulting DNN model can accurately model the unknown dynamical system. For new initial conditions associated with new, and unknown, system parameters, the DNN model can produce accurate system predictions over longer time.

LG · Jun 7, 2021
Deep Neural Network Modeling of Unknown Partial Differential Equations in Nodal Space

Zhen Chen, Victor Churchill, Kailiang Wu et al.

We present a numerical framework for deep neural network (DNN) modeling of unknown time-dependent partial differential equations (PDEs) using their trajectory data. Unlike the recent work of [Wu and Xiu, J. Comput. Phys. 2020], where the learning takes place in modal/Fourier space, the current method conducts the learning and modeling in physical space and uses measurement data as nodal values. We present a DNN structure that has a direct correspondence to the evolution operator of the underlying PDE, thus establishing the existence of the DNN model. The DNN model also does not require any geometric information of the data nodes. Consequently, a trained DNN defines a predictive model for the underlying unknown PDE over structureless grids. A set of examples, including linear and nonlinear scalar PDEs and systems of PDEs, in both one dimension and two dimensions, over structured and unstructured grids, is presented to demonstrate the effectiveness of the proposed DNN modeling. Extension to other equations such as differential-integral equations is also discussed.

SP · Jun 2, 2020
Data-driven learning of non-autonomous systems

Tong Qin, Zhen Chen, John Jakeman et al.

We present a numerical framework for recovering unknown non-autonomous dynamical systems with time-dependent inputs. To circumvent the difficulty presented by the non-autonomous nature of the system, our method transforms the solution state into piecewise integration of the system over a discrete set of time instances. The time-dependent inputs are then locally parameterized by using a proper model, for example, polynomial regression, in the pieces determined by the time instances. This transforms the original system into a piecewise parametric system that is locally time invariant. We then design a deep neural network structure to learn the local models. Once the network model is constructed, it can be iteratively used over time to conduct global system prediction. We provide theoretical analysis of our algorithm and present a number of numerical examples to demonstrate the effectiveness of the method.

ML · Mar 20, 2020
Learning reduced systems via deep neural networks with memory

Xiaohan Fu, Lo-Bin Chang, Dongbin Xiu

We present a general numerical approach for constructing governing equations for unknown dynamical systems when only data on a subset of the state variables are available. The unknown equations for these observed variables are thus a reduced system of the complete set of state variables. Reduced systems possess memory integrals, based on the well-known Mori-Zwanzig (MZ) formalism. Our numerical strategy to recover the reduced system starts by formulating a discrete approximation of the memory integral in the MZ formulation. The resulting unknown approximate MZ equations are finite dimensional, in the sense that a finite number of past history data are involved. We then present a deep neural network structure that directly incorporates the history terms to produce memory in the network. The approach is suitable for any practical system with finite memory length. We then use a set of numerical examples to demonstrate the effectiveness of our method.
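
A small worked example may help show why history terms can close a reduced system. The sketch below is an illustration under stated assumptions, not the paper's network: it takes a harmonic oscillator where only the position is observed, and the function `memory_residual` stands in for a trained network with one memory term. For this particular system the closure is exact, since cos((n+1)Δt) = 2cos(Δt)cos(nΔt) − cos((n−1)Δt).

```python
import math

# Hypothetical example: a harmonic oscillator x'' = -x where only x is observed.
# The unobserved velocity makes the reduced system non-Markovian, but one
# memory term closes it exactly: x[n+1] = 2*cos(dt)*x[n] - x[n-1].

DT = 0.1


def memory_residual(x_now, x_prev):
    """Stand-in for a trained network N(x_n, x_{n-1}); exact here by construction."""
    return (2.0 * math.cos(DT) - 1.0) * x_now - x_prev


def predict(x0, x1, n_steps):
    """Roll the memory model forward: x[n+1] = x[n] + N(x[n], x[n-1])."""
    history = [x0, x1]
    for _ in range(n_steps):
        history.append(history[-1] + memory_residual(history[-1], history[-2]))
    return history


trajectory = predict(math.cos(0.0), math.cos(DT), 100)
max_error = max(abs(x - math.cos(n * DT)) for n, x in enumerate(trajectory))
print(max_error)
```

A model that sees only the current position cannot predict the next one, because the same position occurs with both signs of velocity; adding one past value removes that ambiguity here. General systems would need a longer history and a learned, approximate residual.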

NA · Mar 5, 2020
Methods to Recover Unknown Processes in Partial Differential Equations Using Data

Zhen Chen, Kailiang Wu, Dongbin Xiu

We study the problem of identifying unknown processes embedded in time-dependent partial differential equations (PDEs) using observational data, with an application to advection-diffusion type PDEs. We first conduct theoretical analysis and derive conditions to ensure the solvability of the problem. We then present a set of numerical approaches, including a Galerkin-type algorithm and a collocation-type algorithm. Analysis of the algorithms is presented, along with their implementation details. The Galerkin algorithm is more suitable for practical situations, particularly those with noisy data, as it avoids using derivative/gradient data. Various numerical examples are then presented to demonstrate the performance and properties of the numerical methods.

LG · Feb 11, 2020
A Non-Intrusive Correction Algorithm for Classification Problems with Corrupted Data

Jun Hou, Tong Qin, Kailiang Wu et al.

A novel correction algorithm is proposed for multi-class classification problems with corrupted training data. The algorithm is non-intrusive, in the sense that it post-processes a trained classification model by adding a correction procedure to the model prediction. The correction procedure can be coupled with any approximators, such as logistic regression, neural networks of various architectures, etc. When training dataset is sufficiently large, we prove that the corrected models deliver correct classification results as if there is no corruption in the training data. For datasets of finite size, the corrected models produce significantly better recovery results, compared to the models without the correction algorithm. All of the theoretical findings in the paper are verified by our numerical examples.

LG · Jan 23, 2020
On generalized residue network for deep learning of unknown dynamical systems

Zhen Chen, Dongbin Xiu

We present a general numerical approach for learning unknown dynamical systems using deep neural networks (DNNs). Our method is built upon recent studies that identified the residue network (ResNet) as an effective neural network structure. In this paper, we present a generalized ResNet framework and broadly define residue as the discrepancy between observation data and prediction made by another model, which can be an existing coarse model or reduced-order model. In this case, the generalized ResNet serves as a model correction to the existing model and recovers the unresolved dynamics. When an existing coarse model is not available, we present numerical strategies for fast creation of coarse models, to be used in conjunction with the generalized ResNet. These coarse models are constructed using the same data set and thus do not require additional resources. The generalized ResNet is capable of learning the underlying unknown equations and producing predictions with accuracy higher than the standard ResNet structure. This is demonstrated via several numerical examples, including long-term prediction of a chaotic system.

NA · Oct 15, 2019
Data-Driven Deep Learning of Partial Differential Equations in Modal Space

Kailiang Wu, Dongbin Xiu

We present a framework for recovering/approximating unknown time-dependent partial differential equation (PDE) using its solution data. Instead of identifying the terms in the underlying PDE, we seek to approximate the evolution operator of the underlying PDE numerically. The evolution operator of the PDE, defined in infinite-dimensional space, maps the solution from a current time to a future time and completely characterizes the solution evolution of the underlying unknown PDE. Our recovery strategy relies on approximation of the evolution operator in a properly defined modal space, i.e., generalized Fourier space, in order to reduce the problem to finite dimensions. The finite dimensional approximation is then accomplished by training a deep neural network structure, which is based on residual network (ResNet), using the given data. Error analysis is provided to illustrate the predictive accuracy of the proposed method. A set of examples of different types of PDEs, including inviscid Burgers' equation that develops discontinuity in its solution, are presented to demonstrate the effectiveness of the proposed method.

NA · May 24, 2019
Structure-preserving Method for Reconstructing Unknown Hamiltonian Systems from Trajectory Data

Kailiang Wu, Tong Qin, Dongbin Xiu

We present a numerical approach for approximating unknown Hamiltonian systems using observation data. A distinct feature of the proposed method is that it is structure-preserving, in the sense that it enforces conservation of the reconstructed Hamiltonian. This is achieved by directly approximating the underlying unknown Hamiltonian, rather than the right-hand-side of the governing equations. We present the technical details of the proposed algorithm and its error estimate in a special case, along with a practical de-noising procedure to cope with noisy data. A set of numerical examples are then presented to demonstrate the structure-preserving property and effectiveness of the algorithm.

NA · Nov 13, 2018
Data Driven Governing Equations Approximation Using Deep Neural Networks

Tong Qin, Kailiang Wu, Dongbin Xiu

We present a numerical framework for approximating unknown governing equations using observation data and deep neural networks (DNN). In particular, we propose to use a residual network (ResNet) as the basic building block for equation approximation. We demonstrate that the ResNet block can be considered as a one-step method that is exact in temporal integration. We then present two multi-step methods, recurrent ResNet (RT-ResNet) method and recursive ResNet (RS-ResNet) method. The RT-ResNet is a multi-step method on uniform time steps, whereas the RS-ResNet is an adaptive multi-step method using variable time steps. All three methods presented here are based on the integral form of the underlying dynamical system. As a result, they do not require time derivative data for equation recovery and can cope with relatively coarsely distributed trajectory data. Several numerical examples are presented to demonstrate the performance of the methods.
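
The claim that a ResNet block acts as a one-step method that is exact in time can be illustrated on a system whose flow map is known in closed form. The sketch below is an assumption-laden toy, not the paper's implementation: the system dx/dt = -x, the step `DT`, and the hard-coded `residual` (standing in for a trained network) are all chosen for illustration.

```python
import math

# Toy system dx/dt = -x. Its exact flow map over a step DT is x -> exp(-DT) * x,
# so the exact residual is (exp(-DT) - 1) * x. A trained ResNet block would
# approximate this residual from data; here we hard-code it (an assumption).

DT = 0.25


def residual(x):
    """Stand-in for the learned network N(x) inside the ResNet block."""
    return (math.exp(-DT) - 1.0) * x


def resnet_step(x):
    """One ResNet block: identity shortcut plus residual."""
    return x + residual(x)


def predict(x0, n_steps):
    """Compose the block with itself to march forward in time."""
    x = x0
    for _ in range(n_steps):
        x = resnet_step(x)
    return x


x_final = predict(2.0, 8)            # state at t = 8 * DT = 2.0
exact = 2.0 * math.exp(-2.0)
print(x_final, exact)
```

Because the residual equals the integral of the right-hand side over one step, there is no time-discretization error even though `DT` is large; in practice the only error would come from how well the network fits that residual.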

NA · Sep 24, 2018
Numerical Aspects for Approximating Governing Equations Using Data

Kailiang Wu, Dongbin Xiu

We present effective numerical algorithms for locally recovering unknown governing differential equations from measurement data. We employ a set of standard basis functions, e.g., polynomials, to approximate the governing equation with high accuracy. Upon recasting the problem into a function approximation problem, we discuss several important aspects for accurate approximation. Most notably, we discuss the importance of using a large number of short bursts of trajectory data, rather than using data from a single long trajectory. Several options for the numerical algorithms to perform accurate approximation are then presented, along with an error estimate of the final equation approximation. We then present an extensive set of numerical examples of both linear and nonlinear systems to demonstrate the properties and effectiveness of our equation recovery algorithms.

NA · Aug 22, 2018
An Explicit Neural Network Construction for Piecewise Constant Function Approximation

Kailiang Wu, Dongbin Xiu

We present an explicit construction for a feedforward neural network (FNN), which provides a piecewise constant approximation for multivariate functions. The proposed FNN has two hidden layers, where the weights and thresholds are explicitly defined and do not require numerical optimization for training. Unlike most of the existing work on explicit FNN construction, the proposed FNN does not rely on tensor structure in multiple dimensions. Instead, it automatically creates a Voronoi tessellation of the domain, based on the given data of the target function, and a piecewise constant approximation of the function. This makes the construction more practical for applications. We present both theoretical analysis and numerical examples to demonstrate its properties.
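
For readers unfamiliar with the target approximant, a piecewise constant function on the Voronoi cells of the data points is the same thing as a nearest-sample lookup. The sketch below shows only that approximant, under the assumption of Euclidean distance; it does not reproduce the paper's two-hidden-layer network, and `voronoi_approx` and the sample data are hypothetical.

```python
import math

# Piecewise constant approximation on the Voronoi cells of the sample points:
# every query point takes the value of its nearest sample. This shows only the
# resulting approximant, not the two-hidden-layer network construction.


def voronoi_approx(samples, values):
    """Return f_hat(x) = value at the sample point nearest to x."""
    def f_hat(x):
        nearest = min(range(len(samples)), key=lambda i: math.dist(samples[i], x))
        return values[nearest]
    return f_hat


# Hypothetical scattered data for the target f(x, y) = x + y
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [sx + sy for sx, sy in samples]
f_hat = voronoi_approx(samples, values)

print(f_hat((0.1, 0.2)))   # nearest sample is (0, 0)
print(f_hat((0.9, 0.8)))   # nearest sample is (1, 1)
```

Nothing here depends on the samples lying on a grid, which matches the abstract's remark that no tensor structure is needed.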

ML · May 22, 2018
Reducing Parameter Space for Neural Network Training

Tong Qin, Ling Zhou, Dongbin Xiu

For neural networks (NNs) with rectified linear unit (ReLU) or binary activation functions, we show that their training can be accomplished in a reduced parameter space. Specifically, the weights in each neuron can be trained on the unit sphere, as opposed to the entire space, and the threshold can be trained in a bounded interval, as opposed to the real line. We show that the NNs in the reduced parameter space are mathematically equivalent to the standard NNs with parameters in the whole space. The reduced parameter space shall facilitate the optimization procedure for the network training, as the search space becomes (much) smaller. We demonstrate the improved training performance using numerical examples.
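
The equivalence for ReLU neurons rests on positive homogeneity: relu(c·z) = c·relu(z) for any c > 0, so the norm of the weight vector can be factored out of the neuron. The sketch below checks this numerically for one neuron. It is an illustration only: the helper `to_unit_sphere` and the sample numbers are assumptions, the weight vector is assumed nonzero, and the bounded interval for the threshold discussed in the abstract is not reproduced.

```python
import math


def relu(z):
    return max(z, 0.0)


def neuron(w, b, x):
    """Standard ReLU neuron with unconstrained weights w and threshold b."""
    return relu(sum(wi * xi for wi, xi in zip(w, x)) + b)


def to_unit_sphere(w, b):
    """Rescale (w, b) so that w has unit norm; return the scale separately."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w], b / norm, norm


w, b, x = [3.0, -4.0], 2.5, [0.7, 0.2]
w_unit, b_scaled, scale = to_unit_sphere(w, b)

original = neuron(w, b, x)
reduced = scale * neuron(w_unit, b_scaled, x)  # scale is absorbed by the next layer
print(original, reduced)
```

In a full network the factored-out scale would be folded into the outgoing weights of the next layer, so the unit-norm parameterization represents the same function.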

NA · Apr 13, 2005
Equation-free, multiscale computation for unsteady random diffusion

Dongbin Xiu, Ioannis Kevrekidis

We present an "equation-free" multiscale approach to the simulation of unsteady diffusion in a random medium. The diffusivity of the medium is modeled as a random field with short correlation length, and the governing equations are cast in the form of stochastic differential equations. A detailed fine-scale computation of such a problem requires discretization and solution of a large system of equations, and can be prohibitively time-consuming. To circumvent this difficulty, we propose an equation-free approach, where the fine-scale computation is conducted only for a (small) fraction of the overall time. The evolution of a set of appropriately defined coarse-grained variables (observables) is evaluated during the fine-scale computation, and "projective integration" is used to accelerate the integration. The choice of these coarse variables is an important part of the approach: they are the coefficients of pointwise polynomial expansions of the random solutions. Such a choice of coarse variables allows us to reconstruct representative ensembles of fine-scale solutions with "correct" correlation structures, which is a key to algorithm efficiency. Numerical examples demonstrating accuracy and efficiency of the approach are presented.
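
The projective integration idea mentioned in this abstract can be sketched on a scalar toy problem: run the fine-scale solver for a short burst, estimate the time derivative of the coarse variable from the last two fine states, then extrapolate over a much larger step. Everything below is an assumption for illustration (the model dx/dt = -x, the step sizes, and the function names); the paper's coarse variables are polynomial expansion coefficients, and the reconstruction of fine-scale ensembles is not shown.

```python
import math

# Toy coarse variable obeying dx/dt = -x; the inner stepper stands in for the
# expensive fine-scale simulation. All step sizes are illustrative assumptions.

FINE_DT = 0.001     # fine-scale time step
N_FINE = 10         # fine steps per burst
JUMP = 0.04         # projective (extrapolation) step


def fine_step(x):
    """One forward-Euler step of the fine-scale solver."""
    return x + FINE_DT * (-x)


def projective_step(x):
    """Short fine-scale burst, then extrapolate using the estimated slope."""
    for _ in range(N_FINE - 1):
        x = fine_step(x)
    x_prev, x = x, fine_step(x)
    slope = (x - x_prev) / FINE_DT      # coarse time-derivative estimate
    return x + JUMP * slope


x = 1.0
n_outer = 20                            # 20 * (0.01 + 0.04) = 1.0 time units
for _ in range(n_outer):
    x = projective_step(x)

error = abs(x - math.exp(-1.0))
print(x, error)
```

With these settings the fine solver covers only one fifth of the simulated time, at the cost of a first-order extrapolation error that grows with the size of `JUMP`.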