ML, LG | May 18, 2020

Variational Hyper-Encoding Networks

arXiv:2005.08482v2 (5 citations)
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficiently representing model parameters for multi-task learning, offering a novel approach that could benefit machine learning practitioners dealing with distributional data, though it appears incremental in building upon existing VAE and hyper-network methods.

The authors tackled the problem of encoding distributions of distributions by proposing HyperVAE, a framework that uses a hyper-level VAE to model neural network parameters, resulting in improved information preservation and generalization in tasks like density estimation and outlier detection.

We propose a framework called HyperVAE for encoding distributions of distributions. When a target distribution is modeled by a VAE, its neural network parameters θ are drawn from a distribution p(θ), which is modeled by a hyper-level VAE. We propose a variational inference scheme using Gaussian mixture models to implicitly encode the parameters θ into a low-dimensional Gaussian distribution. Given a target distribution, we predict the posterior distribution of the latent code, then use a matrix-network decoder to generate a posterior distribution q(θ). HyperVAE can encode the parameters θ in full, in contrast to common hyper-network practice, which generates only the scale and bias vectors as target-network parameters. Thus HyperVAE preserves much more information about the model for each task in the latent space. We discuss HyperVAE using the minimum description length (MDL) principle and show that it helps HyperVAE to generalize. We evaluate HyperVAE on density estimation, outlier detection, and discovery of novel design classes, demonstrating its efficacy.
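To make the "latent code to full parameters" step concrete, here is a minimal NumPy sketch of how a low-dimensional Gaussian code could be decoded into an entire weight matrix θ rather than only scale and bias vectors. It is an illustration under stated assumptions, not the authors' implementation: the function names (`sample_latent`, `matrix_decoder`, `init_params`), the layer sizes, and the bilinear form `U @ H @ V + B` used for the matrix-style decoder are all hypothetical choices, and the Gaussian-mixture inference and training objective described in the abstract are omitted.

```python
import numpy as np

def sample_latent(rng, mu, log_var):
    """Reparameterized draw u = mu + sigma * eps from a diagonal Gaussian."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def init_params(rng, latent_dim, seed_shape, theta_shape):
    """Randomly initialize the (hypothetical) decoder weights."""
    k_r, k_c = seed_shape
    rows, cols = theta_shape
    W_in = rng.standard_normal((k_r * k_c, latent_dim)) * 0.1
    U = rng.standard_normal((rows, k_r)) * 0.1
    V = rng.standard_normal((k_c, cols)) * 0.1
    B = np.zeros((rows, cols))
    return W_in, U, V, B

def matrix_decoder(u, params):
    """Map a latent vector u to a full weight matrix theta.

    Assumed form: u is expanded into a small seed matrix H, which is then
    projected on both sides (U @ H @ V) so that the output is a whole
    (rows, cols) matrix without a dense layer of size rows * cols.
    """
    W_in, U, V, B = params
    k_r, k_c = U.shape[1], V.shape[0]
    H = np.tanh(W_in @ u).reshape(k_r, k_c)  # small seed matrix
    return U @ H @ V + B                     # full target-network weights

rng = np.random.default_rng(0)
latent_dim = 4
params = init_params(rng, latent_dim, seed_shape=(3, 3), theta_shape=(8, 5))

# Stand-in posterior over the latent code (zero mean, unit variance).
mu, log_var = np.zeros(latent_dim), np.zeros(latent_dim)
u = sample_latent(rng, mu, log_var)
theta = matrix_decoder(u, params)
print(theta.shape)  # (8, 5)
```

In this sketch each draw of `u` yields a different full matrix `theta`, which is the sense in which a distribution over the latent code induces a distribution q(θ) over target-network parameters. The two-sided projection keeps the decoder's parameter count proportional to the matrix sides rather than to their product, which is one plausible motivation for a matrix-style decoder.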

