Exploiting Inferential Structure in Neural Processes
This work addresses a limitation of Neural Processes (NPs): their difficulty in handling complex context set distributions. The contribution is incremental but important for applications that require robust adaptation.
The authors address the overly simple latent variable distributions assumed by Neural Processes (NPs) by introducing a framework that lets NPs use richer priors defined by graphical models, which improves function modeling and test-time robustness.
Neural Processes (NPs) are appealing due to their ability to perform fast adaptation based on a context set. This set is encoded by a latent variable, which is often assumed to follow a simple distribution. However, in real-world settings, the context set may be drawn from richer distributions, for example ones with multiple modes or heavy tails. In this work, we provide a framework that allows the latent variable of an NP to be given a rich prior defined by a graphical model. These distributional assumptions directly translate into an appropriate aggregation strategy for the context set. Moreover, we describe a message-passing procedure that still allows for end-to-end optimization with stochastic gradients. We demonstrate the generality of our framework by using mixture and Student-t assumptions that yield improvements in function modelling and test-time robustness.
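To illustrate how a distributional assumption can translate into an aggregation rule, the following is a minimal sketch and not the procedure used in this work. It assumes a scalar latent variable, a Gaussian prior, and per-context-point encodings `r` with variances `v`; the function names, the degrees of freedom `nu`, and the EM-style reweighting used for the Student-t case are all illustrative assumptions. Under a Gaussian assumption the posterior mean is a precision-weighted average, whereas a heavy-tailed Student-t assumption down-weights encodings that lie far from the current estimate.

```python
import numpy as np

def gaussian_aggregate(r, v, mu0=0.0, var0=1.0):
    """Conjugate Gaussian update: precision-weighted average of encodings.

    r, v: per-context-point encodings and their variances (hypothetical
    encoder outputs); mu0, var0: assumed Gaussian prior on the latent.
    """
    r, v = np.asarray(r, dtype=float), np.asarray(v, dtype=float)
    precision = 1.0 / var0 + np.sum(1.0 / v)
    mean = (mu0 / var0 + np.sum(r / v)) / precision
    return mean, 1.0 / precision

def student_t_aggregate(r, v, nu=3.0, mu0=0.0, var0=1.0, n_iters=20):
    """Robust aggregation under an assumed Student-t observation model.

    Uses the Gaussian scale-mixture view of the Student-t: each point gets a
    weight that shrinks as its squared standardized residual grows, and the
    Gaussian update is repeated with the rescaled variances. The returned
    variance is a plug-in approximation, not an exact posterior variance.
    """
    r, v = np.asarray(r, dtype=float), np.asarray(v, dtype=float)
    mean, var = gaussian_aggregate(r, v, mu0, var0)
    for _ in range(n_iters):
        # Expected precision scale for each point given the current mean.
        w = (nu + 1.0) / (nu + (r - mean) ** 2 / v)
        mean, var = gaussian_aggregate(r, v / w, mu0, var0)
    return mean, var

# Three consistent encodings near 1.0 and one outlier at 10.0.
r = [1.0, 1.1, 0.9, 10.0]
v = [1.0, 1.0, 1.0, 1.0]
g_mean, g_var = gaussian_aggregate(r, v)
t_mean, t_var = student_t_aggregate(r, v)
print(f"Gaussian mean:  {g_mean:.3f}")
print(f"Student-t mean: {t_mean:.3f}")
```

In this toy example the Gaussian rule is pulled toward the outlier, while the Student-t rule stays close to the three consistent encodings. In an actual NP both rules would operate on vector-valued encoder outputs and be differentiated through during training.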