ML, LG
Sep 10, 2025

Gaussian Process Regression -- Neural Network Hybrid with Optimized Redundant Coordinates

arXiv:2509.08457v1
h-index: 6
Advanced Intelligent Discovery
Originality: Incremental advance
AI Analysis

This incremental improvement addresses overfitting issues in machine learning for applications like interatomic potentials and materials informatics, potentially reducing the need for deep neural networks in some cases.

The paper extends the hybrid Gaussian Process Regression-neural network (GPRNN) method, which is already robust to overfitting, by optimizing its redundant coordinates with a Monte Carlo algorithm. The resulting opt-GPRNN attains the lowest test set error with far fewer neurons while remaining robust to overfitting as the number of neurons grows.

Recently, a Gaussian Process Regression - neural network (GPRNN) hybrid machine learning method was proposed, which is based on additive-kernel GPR in redundant coordinates constructed by rules [J. Phys. Chem. A 127 (2023) 7823]. The method combines the expressive power of an NN with the robustness of linear regression, in particular with respect to overfitting when the number of neurons is increased beyond the optimal value. We introduce opt-GPRNN, in which the redundant coordinates of GPRNN are optimized with a Monte Carlo algorithm. We show that, with optimized redundant coordinates, GPRNN attains the lowest test set error with far fewer terms (neurons) and retains the advantage of avoiding overfitting when the number of neurons is increased beyond the optimal value. opt-GPRNN possesses an expressive power closer to that of a multilayer NN and could obviate the need for deep NNs in some applications. With optimized redundant coordinates, a dimensionality reduction regime is also possible. Examples of application to machine learning of an interatomic potential and to materials informatics are given.
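The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the redundant coordinates are taken to be linear combinations y = W x of the inputs, each additive term uses a one-dimensional RBF kernel with a fixed length scale, the Monte Carlo step is a simple greedy random perturbation of W, and the objective is the error on a held-out validation set. All function names, hyperparameters, and the synthetic data are hypothetical.

```python
import numpy as np

def additive_rbf_kernel(Ya, Yb, length_scale=1.0):
    # Sum of 1-D RBF kernels, one per redundant coordinate (column).
    diff = Ya[:, None, :] - Yb[None, :, :]        # shape (na, nb, n_terms)
    return np.exp(-0.5 * (diff / length_scale) ** 2).sum(axis=2)

def fit_predict(W, X_train, y_train, X_eval, noise=1e-4):
    # Assumed form of the redundant coordinates: y = X W^T.
    Yt, Ye = X_train @ W.T, X_eval @ W.T
    K = additive_rbf_kernel(Yt, Yt) + noise * np.eye(len(Yt))
    alpha = np.linalg.solve(K, y_train)           # GPR posterior-mean weights
    return additive_rbf_kernel(Ye, Yt) @ alpha

def rmse(W, X_train, y_train, X_val, y_val):
    pred = fit_predict(W, X_train, y_train, X_val)
    return float(np.sqrt(np.mean((pred - y_val) ** 2)))

def monte_carlo_optimize(W0, X_train, y_train, X_val, y_val,
                         n_steps=200, step=0.1, seed=0):
    # Greedy random search (an assumption): perturb W and keep the trial
    # only if the validation error decreases.
    rng = np.random.default_rng(seed)
    W, best = W0.copy(), rmse(W0, X_train, y_train, X_val, y_val)
    for _ in range(n_steps):
        trial = W + step * rng.standard_normal(W.shape)
        err = rmse(trial, X_train, y_train, X_val, y_val)
        if err < best:
            W, best = trial, err
    return W, best

# Synthetic demonstration data (hypothetical, not from the paper).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(120, 3))
y = np.sin(X[:, 0] + 2.0 * X[:, 1]) + X[:, 2] ** 2
X_train, y_train, X_val, y_val = X[:80], y[:80], X[80:], y[80:]

W0 = rng.standard_normal((4, 3))   # 4 redundant coordinates from 3 inputs
initial_err = rmse(W0, X_train, y_train, X_val, y_val)
W_opt, best_err = monte_carlo_optimize(W0, X_train, y_train, X_val, y_val)
print(f"validation RMSE: initial {initial_err:.4f}, optimized {best_err:.4f}")
```

Because trial moves are accepted only when they lower the validation error, the optimized error can never exceed the initial one. A Metropolis-style acceptance rule or a different objective would be equally plausible readings of "a Monte Carlo algorithm". Choosing fewer rows in W than input dimensions would correspond, under the same assumptions, to the dimensionality reduction regime mentioned in the abstract.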
