LGAI · Aug 23, 2022

String-based Molecule Generation via Multi-decoder VAE

arXiv:2208.10718v1 · 4 citations · h-index: 54
Originality: Incremental advance
AI Analysis

This work addresses molecular generation for drug discovery or materials science, but it is incremental as it builds on existing VAE methods with ensemble techniques.

The paper tackles string-based molecule generation with variational autoencoders (VAEs) by proposing a multi-decoder ensemble that shares a single encoder. According to the reported experiments, the approach improves performance, particularly when generating out-of-domain samples.

In this paper, we investigate the problem of string-based molecular generation via variational autoencoders (VAEs), which have served as a popular generative approach for various tasks in artificial intelligence. We propose a simple yet effective idea to improve the performance of VAEs on this task. Our main idea is to maintain multiple decoders while sharing a single encoder, i.e., it is a type of ensemble technique. We first find that training each decoder independently may not be effective, as the bias of the ensemble decoder increases severely under its auto-regressive inference. To keep both the bias and the variance of the ensemble model small, our proposed technique is two-fold: (a) a different latent variable is sampled for each decoder (from the estimated mean and variance offered by the shared encoder) to encourage diverse characteristics among the decoders, and (b) a collaborative loss is used during training to control the aggregated quality of the decoders using different latent variables. In our experiments, the proposed VAE model performs particularly well at generating samples from an out-of-domain distribution.
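The two-fold idea in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: `toy_decoder` is a hypothetical linear stand-in for one auto-regressive decoder step, the encoder output (`mu`, `log_var`) is hard-coded, and the form of `collaborative_loss` (negative log-likelihood of the target token under the averaged decoder distribution) is an assumption, since the abstract does not give the formula.

```python
import math
import random


def sample_latents(mu, log_var, num_decoders, rng):
    """Draw one latent vector per decoder from the shared encoder's
    mean and log-variance, using the reparameterization trick."""
    latents = []
    for _ in range(num_decoders):
        z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
             for m, lv in zip(mu, log_var)]
        latents.append(z)
    return latents


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_decoder(weights, z):
    """Hypothetical stand-in for one decoder step: a linear map from
    the latent vector to a next-token distribution."""
    logits = [sum(w * zi for w, zi in zip(row, z)) for row in weights]
    return softmax(logits)


def aggregate(prob_lists):
    """Average the decoders' next-token distributions."""
    k = len(prob_lists)
    vocab = len(prob_lists[0])
    return [sum(p[i] for p in prob_lists) / k for i in range(vocab)]


def collaborative_loss(prob_lists, target):
    """Assumed form: negative log-likelihood of the target token
    under the aggregated (averaged) distribution."""
    return -math.log(aggregate(prob_lists)[target])


rng = random.Random(0)
mu, log_var = [0.2, -0.1], [-1.0, -1.0]  # pretend encoder output
num_decoders, vocab_size = 3, 4

# One random weight matrix per decoder (vocab_size x latent_dim).
decoder_weights = [
    [[rng.uniform(-1.0, 1.0) for _ in mu] for _ in range(vocab_size)]
    for _ in range(num_decoders)
]

latents = sample_latents(mu, log_var, num_decoders, rng)
probs = [toy_decoder(w, z) for w, z in zip(decoder_weights, latents)]
ensemble = aggregate(probs)
loss = collaborative_loss(probs, target=2)
```

In this sketch each decoder receives its own latent sample, so the decoders see different inputs for the same molecule, while the loss is computed on their averaged prediction rather than on each decoder in isolation.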

Code Implementations: 1 repo