On the inability of Gaussian process regression to optimally learn compositional functions
This theoretical result, relevant to machine learning practitioners and researchers, identifies a fundamental limitation of Gaussian process regression.
The paper proves that Gaussian process regression cannot achieve optimal learning rates for compositional functions, showing that the posterior contraction rate is polynomially slower than the minimax rate when the true function has a generalized additive structure.
We rigorously prove that deep Gaussian process priors can outperform Gaussian process priors if the target function has a compositional structure. To this end, we study information-theoretic lower bounds for posterior contraction rates for Gaussian process regression in a continuous regression model. We show that if the true function is a generalized additive function, then the posterior based on any mean-zero Gaussian process can only recover the truth at a rate that is strictly slower than the minimax rate, by a factor that is polynomial in the sample size $n$.
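To make the final claim concrete, the following is a sketch under the standard definition of a generalized additive function. The symbols $d$, $g$, $f_i$, $\varepsilon_n^{\mathrm{GP}}$, $\varepsilon_n^{*}$, and $\gamma$ are notation assumed here for illustration and do not appear in the text above; the smoothness conditions and the value of the exponent are deliberately left unspecified.

$$
f_0(x_1,\dots,x_d) = g\Big(\sum_{i=1}^{d} f_i(x_i)\Big), \qquad \varepsilon_n^{\mathrm{GP}} \gtrsim n^{\gamma}\,\varepsilon_n^{*} \quad \text{for some } \gamma > 0.
$$

Here $g$ and the $f_i$ are univariate functions, $\varepsilon_n^{*}$ denotes the minimax rate over the class of such $f_0$, and $\varepsilon_n^{\mathrm{GP}}$ denotes the contraction rate of the posterior under a mean-zero Gaussian process prior.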