LG ML May 22

Characterizing the Representational Capacity of Neural Processes

arXiv:2605.24210 6.0

AI Analysis

Provides theoretical guidance for architecture selection in meta-learning tasks by clarifying which function classes each NP variant can represent.

The paper characterizes the representational capacity of Neural Process architectures, proving a strict hierarchy among CNPs, ANPs, ConvCNPs, and TNPs, and showing that latent NPs require latent dimension scaling with context size to match GP posteriors.

What functions can Neural Processes represent? We analyze the representational capacity of popular NP architectures: Conditional Neural Processes (CNPs), Attentive Neural Processes (ANPs), Convolutional Conditional Neural Processes (ConvCNPs), Transformer Neural Processes (TNPs), and their latent variants. We prove these architectures form a strict hierarchy. CNP-representable functions are exactly those that depend on finitely many expected features of the context distribution. ANPs strictly generalize CNPs via query-dependent reweighting, which enables kernel smoothers. ConvCNPs and ANPs are incomparable: each contains functions outside the other, separated by stationarity versus translation equivariance. TNPs with $L$ self-attention layers capture $L$-hop context interactions. For latent NPs, we show that finite-dimensional latents provide coherent sampling but do not circumvent encoder limitations; matching GP posterior distributions requires a latent dimension that scales with context size. These results provide a theoretical foundation for architecture selection based on task structure.
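The contrast between the first two levels can be illustrated with a minimal sketch in plain Python. It is not the paper's construction: the feature map `phi`, the function names, the Gaussian `length_scale`, and the toy context set are all assumptions chosen for illustration. The CNP-style encoder mean-pools a fixed, finite set of features, so it sees the context only through an empirical expectation. The ANP-style predictor reweights context points per query, which here reduces to a Nadaraya-Watson kernel smoother.

```python
import math

def phi(x, y):
    # Hypothetical feature map for one context pair (illustrative choice only).
    return [y, x * y, x * x]

def cnp_representation(context):
    # CNP-style encoder: the empirical mean of phi over the context set.
    # The result is independent of the query and of the ordering of the pairs.
    n = len(context)
    feats = [phi(x, y) for x, y in context]
    return [sum(f[k] for f in feats) / n for k in range(len(feats[0]))]

def attention_smoother(query, context, length_scale=0.5):
    # ANP-style predictor: softmax attention over negative squared distances,
    # which equals a Nadaraya-Watson smoother with a Gaussian kernel.
    logits = [-((query - x) ** 2) / (2 * length_scale ** 2) for x, _ in context]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return sum(w * y for w, (_, y) in zip(weights, context)) / total

# Toy context set sampled from y = x^2 (assumed data, not from the paper).
context = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)]

r = cnp_representation(context)
r_doubled = cnp_representation(context + context)
near_zero = attention_smoother(0.0, context, length_scale=0.1)
near_two = attention_smoother(2.0, context, length_scale=0.1)
```

In this sketch, duplicating every context point leaves `r` unchanged up to rounding, which reflects the dependence on the context distribution rather than on the raw set. The smoother's weights change with the query, so `near_zero` and `near_two` follow the nearest context targets. A single query-independent mean-pooled vector cannot express that behaviour without enlarging `phi`.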
