Do Deep Learning Methods Really Perform Better in Molecular Conformation Generation?
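This work challenges the assumed superiority of deep learning for molecular conformation generation (MCG) in drug discovery: a simple algorithm built on traditional methods matches or outperforms deep learning approaches on standard benchmarks, which suggests that those approaches need to be re-examined.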
The authors show that a simple, parameter-free algorithm that clusters RDKit-generated conformations performs comparably to or better than deep learning methods for molecular conformation generation (MCG) on the standard GEOM-QM9 and GEOM-Drugs benchmarks.
Molecular conformation generation (MCG) is a fundamental problem in drug discovery. Many traditional methods have been developed to solve it, including systematic searching, model building, random searching, distance geometry, molecular dynamics, and Monte Carlo methods. However, each has limitations that depend on the molecular structure. Recently, many deep learning based MCG methods have been proposed, and they claim to largely outperform the traditional methods. To our surprise, however, we design a simple and cheap parameter-free algorithm based on the traditional methods and find that it is comparable to, or even outperforms, deep learning based MCG methods on the widely used GEOM-QM9 and GEOM-Drugs benchmarks. In particular, our algorithm is simply the clustering of RDKit-generated conformations. We hope our findings help the community re-examine deep learning methods for MCG. The code for the proposed algorithm can be found at https://gist.github.com/ZhouGengmo/5b565f51adafcd911c0bc115b2ef027c.
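To make the idea of "clustering RDKit-generated conformations" concrete, the following is a minimal sketch and not the code from the gist above. The abstract does not say which clustering method is used, so k-medoids on pairwise RMSD is an assumption chosen for illustration. The function names `conformer_rmsd_matrix` and `k_medoids`, and the parameters `num_confs`, `seed`, and `k`, are hypothetical. The sketch over-generates conformers with RDKit, then keeps one representative conformer per cluster.

```python
def conformer_rmsd_matrix(smiles, num_confs=50, seed=0):
    """Embed conformers with RDKit and return (mol, full pairwise RMSD matrix).

    RDKit is imported lazily so the clustering code below can be used
    without it. RMSD is computed over all atoms, hydrogens included,
    which is a simplification made for this sketch.
    """
    from rdkit import Chem
    from rdkit.Chem import AllChem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    mol = Chem.AddHs(mol)
    conf_ids = list(AllChem.EmbedMultipleConfs(mol, numConfs=num_confs, randomSeed=seed))
    if not conf_ids:
        raise RuntimeError("RDKit embedding produced no conformers")

    # GetConformerRMSMatrix returns the lower triangle as a flat list,
    # ordered (1,0), (2,0), (2,1), (3,0), ...
    tri = AllChem.GetConformerRMSMatrix(mol, prealigned=False)
    n = len(conf_ids)
    dist = [[0.0] * n for _ in range(n)]
    idx = 0
    for i in range(1, n):
        for j in range(i):
            dist[i][j] = dist[j][i] = tri[idx]
            idx += 1
    return mol, dist


def k_medoids(dist, k, max_iter=100):
    """Cluster items given a symmetric distance matrix (list of lists).

    Returns (medoids, labels): medoids[c] is the index of the representative
    item of cluster c, and labels[i] is the cluster index of item i.
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of items")

    def assign(medoids):
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    # Deterministic farthest-point initialisation: start from the most
    # central item, then repeatedly add the item farthest from all medoids.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        medoids.append(max(candidates, key=lambda i: min(dist[i][m] for m in medoids)))

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # can happen with duplicate conformers
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimises total distance to its cluster members.
            new_medoids.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)
```

With RDKit installed, one might call `mol, dist = conformer_rmsd_matrix("CCO", num_confs=50)` followed by `medoids, labels = k_medoids(dist, k=5)` and keep only the conformers whose indices appear in `medoids`. Here `k` plays the role of the number of conformations to output; how the authors fix that number is not stated in the abstract.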