LG QM May 4, 2023

Are VAEs Bad at Reconstructing Molecular Graphs?

arXiv:2305.03041v1, 3 citations
Originality: Synthesis-oriented
AI Analysis

This work examines how well current models reconstruct molecular graphs, a question relevant to researchers in computational chemistry and machine learning. It is incremental: it clarifies existing limitations without proposing a new solution.

The study evaluated the reconstruction accuracy of state-of-the-art variational auto-encoders on a large, diverse dataset of molecular graphs and found it surprisingly low. It also showed that improving reconstruction does not directly improve sampling or optimization performance.
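As a rough illustration of the metric discussed here, reconstruction accuracy can be sketched as the fraction of inputs that a model decodes back to exactly the same molecule. This is a minimal sketch under assumptions not stated on this page: the function name `reconstruction_accuracy` is hypothetical, molecules are represented as strings that have already been canonicalized (for example canonical SMILES), and only an exact match counts as a success.

```python
def reconstruction_accuracy(inputs, reconstructions):
    """Return the fraction of inputs whose reconstruction matches exactly.

    Assumption: both lists hold canonicalized molecule strings, so plain
    string equality stands in for graph identity.
    """
    if len(inputs) != len(reconstructions):
        raise ValueError("inputs and reconstructions must have the same length")
    if not inputs:
        raise ValueError("at least one molecule is required")
    matches = sum(a == b for a, b in zip(inputs, reconstructions))
    return matches / len(inputs)


# Toy data (made up for illustration): the third molecule is decoded wrongly.
inputs = ["CCO", "c1ccccc1", "CC(=O)O"]
decoded = ["CCO", "c1ccccc1", "CCC(=O)O"]
accuracy = reconstruction_accuracy(inputs, decoded)
print(f"reconstruction accuracy: {accuracy:.3f}")
```

An exact-match criterion is strict: a failed reconstruction that rearranges the same motifs still scores zero, even when it is chemically close to the input.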

Many contemporary generative models of molecules are variational auto-encoders of molecular graphs. One term in their training loss pertains to reconstructing the input, yet reconstruction capabilities of state-of-the-art models have not yet been thoroughly compared on a large and chemically diverse dataset. In this work, we show that when several state-of-the-art generative models are evaluated under the same conditions, their reconstruction accuracy is surprisingly low, worse than what was previously reported on seemingly harder datasets. However, we show that improving reconstruction does not directly lead to better sampling or optimization performance. Failed reconstructions from the MoLeR model are usually similar to the inputs, assembling the same motifs in a different way, and possess similar chemical properties such as solubility. Finally, we show that the input molecule and its failed reconstruction are usually mapped by the different encoders to statistically distinguishable posterior distributions, hinting that posterior collapse may not fully explain why VAEs are bad at reconstructing molecular graphs.
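For reference, the training loss mentioned in the abstract can be written in the standard variational auto-encoder form. This is a sketch of the generic objective, not the exact loss of any model evaluated in the paper. The symbols are assumptions introduced here: $x$ is the input molecular graph, $z$ the latent code, $q_\phi$ the encoder, $p_\theta$ the decoder, $p(z)$ the prior, and $\beta$ an optional weight on the regularization term. Individual models may add further terms.

```latex
\mathcal{L}(\theta, \phi; x)
  = \underbrace{-\,\mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]}_{\text{reconstruction term}}
  \;+\; \beta \, \underbrace{D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)}_{\text{regularization term}}
```

The first term is the one that rewards reconstructing the input. Posterior collapse corresponds to the second term being driven to zero, so that $q_\phi(z \mid x)$ carries little information about $x$.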
