Permutation-preserving Functions and Neural Vecchia Covariance Kernels
This work addresses the scalable learning of expressive, non-stationary kernels for Gaussian processes, a problem relevant to practitioners in spatial statistics and machine learning.
The paper introduces a framework for learning scalable and flexible covariance kernels for Gaussian processes by modeling the parameters of the Vecchia approximation with deep neural networks. The framework improves training stability and data efficiency while remaining computationally scalable.
We introduce a novel framework for constructing scalable and flexible covariance kernels for Gaussian processes (GPs), in which deep neural architectures directly learn the covariance structure under the regression-type parameterization induced by Vecchia approximations. Specifically, we model the kriging coefficients and conditional standard deviations. These deterministic quantities uniquely characterize the covariance and therefore provide stable and informative learning targets. Exploiting the permutation-equivariant structure of conditioning sets in the Vecchia factorization, we derive a universal representation for permutation-preserving functions and design neural architectures that respect this symmetry, leading to improved training stability and data efficiency. The proposed approach enables expressive, non-stationary kernel learning while maintaining computational scalability, thereby bridging classical GP methodology with modern deep learning.
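To make the regression-type parameterization concrete, the following is a minimal NumPy sketch, not the paper's neural architecture. It computes the kriging coefficients and conditional standard deviations in closed form from a known kernel, rebuilds the precision matrix they imply, and checks that reordering a conditioning set reorders the coefficients in the same way while leaving the conditional standard deviation unchanged. The squared-exponential kernel, the nugget value, and the names `rbf_kernel`, `vecchia_params`, and `vecchia_precision` are assumptions made for illustration only.

```python
import numpy as np


def rbf_kernel(X, Y, lengthscale=0.3):
    """Squared-exponential kernel, used only as a known reference covariance."""
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def vecchia_params(K, cond_sets):
    """Return kriging coefficients b_i and conditional std devs d_i.

    For a zero-mean GP, y_i | y_c(i) ~ N(b_i^T y_c(i), d_i^2), where
    b_i = K_cc^{-1} K_ci and d_i^2 = K_ii - K_ic K_cc^{-1} K_ci.
    """
    coeffs, stds = [], []
    for i, c in enumerate(cond_sets):
        c = np.asarray(c, dtype=int)
        if c.size == 0:
            b, var = np.zeros(0), K[i, i]
        else:
            b = np.linalg.solve(K[np.ix_(c, c)], K[c, i])
            var = K[i, i] - K[i, c] @ b
        coeffs.append(b)
        stds.append(np.sqrt(var))
    return coeffs, np.array(stds)


def vecchia_precision(coeffs, stds, cond_sets):
    """Precision matrix Q = (I - B)^T D^{-1} (I - B) implied by (b_i, d_i)."""
    n = len(stds)
    B = np.zeros((n, n))
    for i, (c, b) in enumerate(zip(cond_sets, coeffs)):
        if len(c) > 0:
            B[i, np.asarray(c, dtype=int)] = b
    A = np.eye(n) - B
    return A.T @ np.diag(1.0 / stds**2) @ A


# Hypothetical toy data: 6 points in the unit square, small nugget for stability.
rng = np.random.default_rng(0)
X = rng.uniform(size=(6, 2))
K = rbf_kernel(X, X) + 1e-2 * np.eye(6)

# Conditioning on all previous points makes the factorization exact.
full_sets = [list(range(i)) for i in range(6)]
coeffs, stds = vecchia_params(K, full_sets)
Q = vecchia_precision(coeffs, stds, full_sets)
print("max |Q^{-1} - K|:", np.abs(np.linalg.inv(Q) - K).max())

# Reversing each conditioning set reverses the coefficients and keeps the stds.
rev_sets = [c[::-1] for c in full_sets]
coeffs_rev, stds_rev = vecchia_params(K, rev_sets)
print("coefficients equivariant:",
      all(np.allclose(br, b[::-1]) for br, b in zip(coeffs_rev, coeffs)))
print("stds invariant:", np.allclose(stds_rev, stds))
```

In practice the conditioning sets would be small neighbor sets rather than all previous points, which is what makes the factorization scalable. In the sketch, a learned model would take the place of the closed-form `vecchia_params` step.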