LG, AI | May 29, 2023

Autoencoding Conditional Neural Processes for Representation Learning

Victor Prokhorov, Ivan Titov, N. Siddharth

arXiv:2305.18485v2 | 2.0 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of efficient and meaningful representation learning for visual data, with applications such as image completion and classification. The advance is incremental, as it builds on existing CNP methods.

The paper tackles the problem of selecting which pixels to observe when training Conditional Neural Processes (CNPs) for image completion. It develops the PPS-VAE framework, which learns the context pixels as latent variables. The results show better-fit CNPs and meaningful image characterization, evaluated through classification on both in-distribution and out-of-distribution data.

Conditional neural processes (CNPs) are a flexible and efficient family of models that learn to learn a stochastic process from data. They have seen particular application in contextual image completion: observing pixel values at some locations to predict a distribution over values at other, unobserved locations. However, the choice of pixels in learning CNPs is typically either random or derived from a simple statistical measure (e.g. pixel variance). Here, we turn the problem on its head and ask: which pixels would a CNP like to observe? Do they facilitate fitting better CNPs, and do such pixels tell us something meaningful about the underlying image? To this end we develop the Partial Pixel Space Variational Autoencoder (PPS-VAE), an amortised variational framework that casts CNP context as latent variables learnt simultaneously with the CNP. We evaluate PPS-VAE over a number of tasks across different visual data, and find that not only can it facilitate better-fit CNPs, but also that the spatial arrangement and values of the context pixels meaningfully characterise image information, as evaluated through the lens of classification on both in-distribution and out-of-distribution data. Our model additionally allows for dynamic adaptation of context-set size and the ability to scale up to larger images, providing a promising avenue to explore learning meaningful and effective visual representations.
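To make the two baseline strategies mentioned in the abstract concrete, here is a minimal sketch of how a context set might be picked at random or by local pixel variance. This is not PPS-VAE, which learns the context jointly with the CNP. The function names, the grayscale input, and the 3x3 window are illustrative assumptions and do not come from the paper or its code release.

```python
import numpy as np


def random_context(image, n_context, rng):
    """Pick n_context pixel locations uniformly at random, without replacement."""
    h, w = image.shape
    flat_idx = rng.choice(h * w, size=n_context, replace=False)
    rows, cols = np.unravel_index(flat_idx, (h, w))
    # Return (row, col) locations and the pixel values observed there.
    return np.stack([rows, cols], axis=1), image[rows, cols]


def variance_context(image, n_context, window=3):
    """Pick the n_context locations with the highest local pixel variance.

    Assumes a 2D grayscale image and an odd window size (hypothetical choices).
    """
    h, w = image.shape
    pad = window // 2
    padded = np.pad(image, pad, mode="edge")
    # View each pixel's (window x window) neighbourhood, then take its variance.
    patches = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    local_var = patches.var(axis=(-2, -1))
    flat_idx = np.argsort(local_var.ravel())[::-1][:n_context]
    rows, cols = np.unravel_index(flat_idx, (h, w))
    return np.stack([rows, cols], axis=1), image[rows, cols]


# Usage on a synthetic 28x28 image (stand-in data, not from the paper).
rng = np.random.default_rng(0)
image = rng.random((28, 28))
xy_rand, v_rand = random_context(image, 30, rng)
xy_var, v_var = variance_context(image, 30)
```

Either function returns a set of (location, value) pairs that a CNP could condition on; the question the abstract raises is whether a learnt selection does better than these fixed heuristics.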

