LG · ML · Feb 9, 2023

Gaussian Process-Gated Hierarchical Mixtures of Experts

arXiv:2302.04947v2 · 5 citations · h-index: 47
Originality: Incremental advance
AI Analysis

This work addresses the need for more interpretable and efficient deep Bayesian models, such as deep GPs and Bayesian neural networks. It is an incremental advance: it builds on existing mixtures of experts and contributes a new kind of gating function.

The paper aims to improve mixtures of experts by proposing Gaussian process-gated hierarchical mixtures of experts (GPHMEs), which use Gaussian processes for both the gating functions and the experts. The authors report that GPHMEs outperform tree-based HME benchmarks at reduced complexity and perform well on large-scale data sets.

In this paper, we propose novel Gaussian process-gated hierarchical mixtures of experts (GPHMEs). Unlike other mixtures of experts with gating models linear in the input, our model employs gating functions built with Gaussian processes (GPs). These processes are based on random features that are non-linear functions of the inputs. Furthermore, the experts in our model are also constructed with GPs. The optimization of the GPHMEs is performed by variational inference. The proposed GPHMEs have several advantages. They outperform tree-based HME benchmarks that partition the data in the input space, and they achieve good performance with reduced complexity. Another advantage is the interpretability they provide for deep GPs, and more generally, for deep Bayesian neural networks. Our GPHMEs demonstrate excellent performance for large-scale data sets, even with quite modest sizes.
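To make the structure described in the abstract concrete, here is a minimal sketch of a depth-2 hierarchical mixture of experts whose gates and experts are both linear in random features of the input. It is an illustration under stated assumptions, not the authors' implementation: the trigonometric (random Fourier) features for an RBF kernel, the sigmoid gates, the tree depth, and all names (`random_fourier_features`, `leaf_probabilities`, `predict`) are hypothetical choices, and the weights are simply drawn from a standard normal prior instead of being fitted by variational inference.

```python
import numpy as np

def random_fourier_features(X, Omega):
    """Map inputs X (n, d) to 2*m trigonometric features (assumed RBF-kernel approximation)."""
    proj = X @ Omega                                  # (n, m) random projections
    m = Omega.shape[1]
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(m)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def leaf_probabilities(Phi, gate_weights):
    """Depth-2 binary tree: one root gate and two child gates yield four leaves."""
    g = sigmoid(Phi @ gate_weights)                   # (n, 3): root, left child, right child
    root, left, right = g[:, 0], g[:, 1], g[:, 2]
    return np.stack(
        [
            root * left,                              # path: left, left
            root * (1.0 - left),                      # path: left, right
            (1.0 - root) * right,                     # path: right, left
            (1.0 - root) * (1.0 - right),             # path: right, right
        ],
        axis=1,
    )                                                 # (n, 4), each row sums to 1

def predict(X, Omega, gate_weights, expert_weights):
    """Combine expert outputs, weighted by the probability of reaching each leaf."""
    Phi = random_fourier_features(X, Omega)
    probs = leaf_probabilities(Phi, gate_weights)     # (n, 4)
    expert_out = Phi @ expert_weights                 # (n, 4), one column per leaf expert
    return np.sum(probs * expert_out, axis=1), probs

# Toy usage with weights sampled from a standard normal prior (no training).
rng = np.random.default_rng(0)
n, d, m = 5, 2, 50                                    # points, input dim, random frequencies
X = rng.normal(size=(n, d))
lengthscale = 1.0                                     # assumed kernel length-scale
Omega = rng.normal(size=(d, m)) / lengthscale
gate_weights = rng.normal(size=(2 * m, 3))
expert_weights = rng.normal(size=(2 * m, 4))
y_hat, probs = predict(X, Omega, gate_weights, expert_weights)
```

Because the gates act on non-linear random features instead of on the raw input, the soft partition they induce is not restricted to linear boundaries in the input space, which is the contrast the abstract draws with linearly gated mixtures of experts.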

Code Implementations: 2 repos