LG | ML | Jun 30, 2021

Improving black-box optimization in VAE latent space using decoder uncertainty

arXiv:2107.00096v1 | 72 citations
Originality: Incremental advance
AI Analysis

This work addresses robustness issues in black-box optimization for domain-specific applications like drug design and function approximation, though it is incremental as it builds on existing VAE frameworks without architectural changes.

The paper tackles unreliable decoder outputs in VAE latent space optimization when generating discrete objects such as molecules and arithmetic expressions. It introduces an importance sampling-based estimator of the decoder's epistemic uncertainty and uses it to guide the optimization. The result is an improved trade-off between the black-box objective and sample validity across multiple experimental settings.

Optimization in the latent space of variational autoencoders is a promising approach to generate high-dimensional discrete objects that maximize an expensive black-box property (e.g., drug-likeness in molecular generation, function approximation with arithmetic expressions). However, existing methods lack robustness as they may decide to explore areas of the latent space for which no data was available during training and where the decoder can be unreliable, leading to the generation of unrealistic or invalid objects. We propose to leverage the epistemic uncertainty of the decoder to guide the optimization process. This is not trivial though, as a naive estimation of uncertainty in the high-dimensional and structured settings we consider would result in high estimator variance. To solve this problem, we introduce an importance sampling-based estimator that provides more robust estimates of epistemic uncertainty. Our uncertainty-guided optimization approach does not require modifications to the model architecture or the training process. It produces samples with a better trade-off between black-box objective and validity of the generated samples, sometimes improving both simultaneously. We illustrate these advantages across several experimental settings in digit generation, arithmetic expression approximation and molecule generation for drug design.
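
The abstract does not spell out the estimator, so the following is only a toy sketch of the general idea, not the authors' method. It assumes that epistemic uncertainty at a latent point is measured as the mutual information between the decoded output and the decoder parameters, and that a handful of decoder parameter samples (for example from dropout) each yield a categorical distribution over a small output space. The function names, the proposal distribution, and the five-outcome output space are all hypothetical choices made for illustration. Because the output space is tiny, the importance-sampled estimate can be checked against the exact value.

```python
import numpy as np


def exact_mutual_information(probs):
    """Exact mutual information for a small categorical output space.

    probs has shape (M, K): one distribution over K outcomes for each of
    M sampled decoder parameter settings (hypothetical setup).
    """
    mean_p = probs.mean(axis=0)
    entropy_of_mean = -np.sum(mean_p * np.log(mean_p))
    mean_of_entropy = -np.mean(np.sum(probs * np.log(probs), axis=1))
    return entropy_of_mean - mean_of_entropy


def importance_sampled_mutual_information(probs, proposal, n_samples, rng):
    """Estimate the same quantity by sampling outcomes from a proposal.

    Each entropy is a sum over outcomes; the sum is replaced by an average
    over draws x ~ proposal, reweighted by 1 / proposal(x).
    """
    n_outcomes = probs.shape[1]
    mean_p = probs.mean(axis=0)
    x = rng.choice(n_outcomes, size=n_samples, p=proposal)
    q = proposal[x]
    # Terms whose expectation under the proposal is H[mean_p].
    entropy_of_mean_terms = -(mean_p[x] / q) * np.log(mean_p[x])
    # Terms whose expectation is the average entropy over decoder samples.
    p_x = probs[:, x]  # shape (M, n_samples)
    mean_of_entropy_terms = -np.mean(p_x * np.log(p_x), axis=0) / q
    return float(np.mean(entropy_of_mean_terms - mean_of_entropy_terms))


rng = np.random.default_rng(0)
# Eight hypothetical decoder samples over five outcomes.
probs = rng.dirichlet(np.ones(5), size=8)
# Assumed proposal: the predictive mean mixed with a uniform distribution,
# which keeps the importance weights bounded.
proposal = 0.5 * probs.mean(axis=0) + 0.5 * np.full(5, 0.2)

exact = exact_mutual_information(probs)
estimate = importance_sampled_mutual_information(probs, proposal, 200_000, rng)
print(f"exact={exact:.4f} estimate={estimate:.4f}")
```

In a real sequence decoder the output space is far too large to enumerate, so no exact value is available and the choice of proposal largely determines the estimator's variance. A score of this kind could then be used, for instance, as a penalty or constraint while optimizing over latent points.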

Code Implementations: 1 repo