Degeneration of kernel regression with Matern kernels into low-order polynomial regression in high dimension
The result highlights a critical limitation for researchers who use kernel methods in high-dimensional applications such as materials informatics, explains why polynomial approximations can succeed, and underscores the need for alternative kernels or models.
The paper shows that, in high-dimensional regimes where the data are necessarily sparse, kernel regression with Matern kernels degenerates into low-order polynomial regression and loses its advantage over such regression; this is demonstrated theoretically and numerically on six- and fifteen-dimensional molecular potential energy surfaces.
Kernel methods such as kernel ridge regression and Gaussian process regression with Matern-type kernels have been increasingly used, in particular to fit potential energy surfaces (PES) and density functionals, and for materials informatics. When the dimensionality of the feature space is high, these methods are necessarily used with sparse data. In this regime, the optimal length parameter of a Matern-type kernel tends to become so large that the method effectively degenerates into a low-order polynomial regression and therefore loses any advantage over such regression. This is demonstrated theoretically as well as numerically on the examples of six- and fifteen-dimensional molecular PES using squared exponential and simple exponential kernels. The results shed additional light on the success of polynomial approximations such as permutationally invariant polynomials (PIP) for medium-size molecules, and on the importance of orders-of-coupling based models for preserving the advantages of kernel methods with Matern-type kernels, or of using physically motivated (reproducing) kernels.
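The mechanism behind the degeneration can be illustrated with a minimal numerical sketch. It is an illustration under stated assumptions, not the paper's own calculation: it uses synthetic random points in place of PES data, and the sample size, dimension, and length values are hypothetical choices. For the squared exponential kernel, exp(-r^2/(2 l^2)) = 1 - r^2/(2 l^2) + O(l^-4), and since r^2 = |x|^2 - 2 x.x' + |x'|^2, the leading terms form a second-degree polynomial in the features. The sketch checks how closely the kernel matrix matches this truncation for a large and a small length parameter.

```python
import numpy as np

def sq_dists(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(axis=-1)

def se_kernel(r2, length):
    """Squared exponential kernel evaluated on squared distances r2."""
    return np.exp(-r2 / (2.0 * length ** 2))

def taylor_kernel(r2, length):
    """First-order truncation in r2 / (2 l^2): a degree-2 polynomial in the features."""
    return 1.0 - r2 / (2.0 * length ** 2)

# Hypothetical setup: 50 random points in the unit cube [0, 1]^6 (not PES data).
rng = np.random.default_rng(0)
n, d = 50, 6
X = rng.random((n, d))
r2 = sq_dists(X)

# Largest elementwise gap between the kernel and its polynomial truncation.
err_large = np.abs(se_kernel(r2, 100.0) - taylor_kernel(r2, 100.0)).max()
err_small = np.abs(se_kernel(r2, 0.5) - taylor_kernel(r2, 0.5)).max()

# The truncated matrix lives in the span of {1, x_1..x_d, |x|^2},
# so its rank cannot exceed d + 2 regardless of the number of points.
rank_taylor = np.linalg.matrix_rank(taylor_kernel(r2, 100.0), tol=1e-10)

print(f"max |K - K_poly| at length 100: {err_large:.2e}")
print(f"max |K - K_poly| at length 0.5: {err_small:.2e}")
print(f"rank of truncated kernel matrix: {rank_taylor} (bound: {d + 2})")
```

When the length parameter is large relative to the spread of the data, the kernel matrix is nearly indistinguishable from one built on the basis {1, x_k, |x|^2}, so the regression can express little beyond a low-order polynomial in that limit; at a small length parameter the truncation fails and the kernel retains its non-polynomial character.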