LG · ML · May 23, 2018

Constrained Graph Variational Autoencoders for Molecule Design

arXiv:1805.09076v2 · 501 citations
Originality: Incremental advance
AI Analysis

This work addresses molecule design for drug discovery or materials science, offering an incremental improvement in graph generation methods.

The paper tackles the problem of generating chemical molecules that match the distribution of the training data. It proposes a graph-structured variational autoencoder with a sequential decoder and shows that it outperforms baselines at matching dataset statistics while also enabling property-optimized molecule design.

Graphs are ubiquitous data structures for representing interactions between entities. With an emphasis on the use of graphs to represent chemical molecules, we explore the task of learning to generate graphs that conform to a distribution observed in training data. We propose a variational autoencoder model in which both encoder and decoder are graph-structured. Our decoder assumes a sequential ordering of graph extension steps and we discuss and analyze design choices that mitigate the potential downsides of this linearization. Experiments compare our approach with a wide range of baselines on the molecule generation task and show that our method is more successful at matching the statistics of the original dataset on semantically important metrics. Furthermore, we show that by using appropriate shaping of the latent space, our model allows us to design molecules that are (locally) optimal in desired properties.
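The abstract's last claim, that a suitably shaped latent space allows the design of (locally) optimal molecules, can be pictured as hill climbing on a property predictor defined over latent vectors. The sketch below is a toy illustration only and is not the paper's implementation: `predict_property`, its analytic gradient, the 2-D latent space, and the step size are all hypothetical stand-ins for a learned predictor, and decoding the optimized latent point back into a molecular graph is not shown.

```python
# Toy illustration only: gradient ascent in a 2-D latent space.
# All names and numbers here are hypothetical, not taken from the paper.

def predict_property(z):
    """Hypothetical smooth property surrogate with a single peak at (1.0, -2.0)."""
    return -((z[0] - 1.0) ** 2 + (z[1] + 2.0) ** 2)


def property_gradient(z):
    """Analytic gradient of the toy surrogate above."""
    return [-2.0 * (z[0] - 1.0), -2.0 * (z[1] + 2.0)]


def ascend(z, step_size=0.1, num_steps=100):
    """Move a latent point uphill on the surrogate and return the final point."""
    z = list(z)
    for _ in range(num_steps):
        grad = property_gradient(z)
        z = [zi + step_size * gi for zi, gi in zip(z, grad)]
    return z


z_start = [0.0, 0.0]  # stands in for the encoding of a seed molecule (assumed)
z_opt = ascend(z_start)
print(predict_property(z_start), predict_property(z_opt))
```

Because the ascent only follows the local gradient, it reaches a local optimum near the starting point, which matches the "(locally) optimal" qualifier in the abstract. In a real model the surrogate would generally have many peaks, so the result would depend on the seed.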

Code Implementations: 1 repo
