Manabu Ihara

ML
h-index72
7papers
94citations
Novelty36%
AI Score35

7 Papers

MLJan 13, 2023
Neural network with optimal neuron activation functions based on additive Gaussian process regression

Sergei Manzhos, Manabu Ihara

Feed-forward neural networks (NN) are a staple machine learning method widely used in many areas of science and technology. While even a single-hidden layer NN is a universal approximator, its expressive power is limited by the use of simple neuron activation functions (such as sigmoid functions) that are typically the same for all neurons. More flexible neuron activation functions would allow using fewer neurons and layers and thereby save computational cost and improve expressive power. We show that additive Gaussian process regression (GPR) can be used to construct optimal neuron activation functions that are individual to each neuron. An approach is also introduced that avoids non-linear fitting of neural network parameters. The resulting method combines the advantage of robustness of a linear regression with the higher expressive power of a NN. We demonstrate the approach by fitting the potential energy surfaces of the water molecule and formaldehyde. Without requiring any non-linear optimization, the additive GPR based approach outperforms a conventional NN in the high accuracy regime, where a conventional NN suffers more from overfitting.

COMP-PHNov 17, 2023
Degeneration of kernel regression with Matern kernels into low-order polynomial regression in high dimension

Sergei Manzhos, Manabu Ihara

Kernel methods such as kernel ridge regression and Gaussian process regressions with Matern type kernels have been increasingly used, in particular, to fit potential energy surfaces (PES) and density functionals, and for materials informatics. When the dimensionality of the feature space is high, these methods are used with necessarily sparse data. In this regime, the optimal length parameter of a Matern-type kernel tends to become so large that the method effectively degenerates into a low-order polynomial regression and therefore loses any advantage over such regression. This is demonstrated theoretically as well as numerically on the examples of six- and fifteen-dimensional molecular PES using squared exponential and simple exponential kernels. The results shed additional light on the success of polynomial approximations such as PIP for medium size molecules and on the importance of orders-of-coupling based models for preserving the advantages of kernel methods with Matern type kernels or on the use of physically-motivated (reproducing) kernels.

LGFeb 11, 2023
Orders-of-coupling representation with a single neural network with optimal neuron activation functions and without nonlinear parameter optimization

Sergei Manzhos, Manabu Ihara

Representations of multivariate functions with low-dimensional functions that depend on subsets of original coordinates (corresponding of different orders of coupling) are useful in quantum dynamics and other applications, especially where integration is needed. Such representations can be conveniently built with machine learning methods, and previously, methods building the lower-dimensional terms of such representations with neural networks [e.g. Comput. Phys. Comm. 180 (2009) 2002] and Gaussian process regressions [e.g. Mach. Learn. Sci. Technol. 3 (2022) 01LT02] were proposed. Here, we show that neural network models of orders-of-coupling representations can be easily built by using a recently proposed neural network with optimal neuron activation functions computed with a first-order additive Gaussian process regression [arXiv:2301.05567] and avoiding non-linear parameter optimization. Examples are given of representations of molecular potential energy surfaces.

MLNov 21, 2022
The loss of the property of locality of the kernel in high-dimensional Gaussian process regression on the example of the fitting of molecular potential energy surfaces

Sergei Manzhos, Manabu Ihara

Kernel based methods including Gaussian process regression (GPR) and generally kernel ridge regression (KRR) have been finding increasing use in computational chemistry, including the fitting of potential energy surfaces and density functionals in high-dimensional feature spaces. Kernels of the Matern family such as Gaussian-like kernels (basis functions) are often used, which allows imparting them the meaning of covariance functions and formulating GPR as an estimator of the mean of a Gaussian distribution. The notion of locality of the kernel is critical for this interpretation. It is also critical to the formulation of multi-zeta type basis functions widely used in computational chemistry We show, on the example of fitting of molecular potential energy surfaces of increasing dimensionality, the practical disappearance of the property of locality of a Gaussian-like kernel in high dimensionality. We also formulate a multi-zeta approach to the kernel and show that it significantly improves the quality of regression in low dimensionality but loses any advantage in high dimensionality, which is attributed to the loss of the property of locality.

NEOct 13, 2025
Neural networks for neurocomputing circuits: a computational study of tolerance to noise and activation function non-uniformity when machine learning materials properties

Ye min Thant, Methawee Nukunudompanich, Chu-Chen Chueh et al.

Dedicated analog neurocomputing circuits are promising for high-throughput, low power consumption applications of machine learning (ML) and for applications where implementing a digital computer is unwieldy (remote locations; small, mobile, and autonomous devices, extreme conditions, etc.). Neural networks (NN) implemented in such circuits, however, must contend with circuit noise and the non-uniform shapes of the neuron activation function (NAF) due to the dispersion of performance characteristics of circuit elements (such as transistors or diodes implementing the neurons). We present a computational study of the impact of circuit noise and NAF inhomogeneity in function of NN architecture and training regimes. We focus on one application that requires high-throughput ML: materials informatics, using as representative problem ML of formation energies vs. lowest-energy isomer of peri-condensed hydrocarbons, formation energies and band gaps of double perovskites, and zero point vibrational energies of molecules from QM9 dataset. We show that NNs generally possess low noise tolerance with the model accuracy rapidly degrading with noise level. Single-hidden layer NNs, and NNs with larger-than-optimal sizes are somewhat more noise-tolerant. Models that show less overfitting (not necessarily the lowest test set error) are more noise-tolerant. Importantly, we demonstrate that the effect of activation function inhomogeneity can be palliated by retraining the NN using practically realized shapes of NAFs.

MLSep 10, 2025
Gaussian Process Regression -- Neural Network Hybrid with Optimized Redundant Coordinates

Sergei Manzhos, Manabu Ihara

Recently, a Gaussian Process Regression - neural network (GPRNN) hybrid machine learning method was proposed, which is based on additive-kernel GPR in redundant coordinates constructed by rules [J. Phys. Chem. A 127 (2023) 7823]. The method combined the expressive power of an NN with the robustness of linear regression, in particular, with respect to overfitting when the number of neurons is increased beyond optimal. We introduce opt-GPRNN, in which the redundant coordinates of GPRNN are optimized with a Monte Carlo algorithm and show that when combined with optimization of redundant coordinates, GPRNN attains the lowest test set error with much fewer terms / neurons and retains the advantage of avoiding overfitting when the number of neurons is increased beyond optimal value. The method, opt-GPRNN possesses an expressive power closer to that of a multilayer NN and could obviate the need for deep NNs in some applications. With optimized redundant coordinates, a dimensionality reduction regime is also possible. Examples of application to machine learning an interatomic potential and materials informatics are given.

CONov 24, 2020
Random Sampling High Dimensional Model Representation Gaussian Process Regression (RS-HDMR-GPR) for representing multidimensional functions with machine-learned lower-dimensional terms allowing insight with a general method

Owen Ren, Mohamed Ali Boussaidi, Dmitry Voytsekhovsky et al.

We present a Python implementation for RS-HDMR-GPR (Random Sampling High Dimensional Model Representation Gaussian Process Regression). The method builds representations of multivariate functions with lower-dimensional terms, either as an expansion over orders of coupling or using terms of only a given dimensionality. This facilitates, in particular, recovering functional dependence from sparse data. The code also allows for imputation of missing values of the variables and for a significant pruning of the useful number of HDMR terms. The code can also be used for estimating relative importance of different combinations of input variables, thereby adding an element of insight to a general machine learning method. The capabilities of this regression tool are demonstrated on test cases involving synthetic analytic functions, the potential energy surface of the water molecule, kinetic energy densities of materials (crystalline magnesium, aluminum, and silicon), and financial market data.