Automatic Identification of Chemical Moieties
This work addresses the need for automated chemical analysis in drug discovery and materials science, although the contribution is incremental in that it builds on existing message-passing neural network (MPNN) methods.
The authors address the problem of automatically identifying chemical moieties from the atomic representations constructed by message-passing neural networks, which enables applications such as selecting representative database entries and constructing coarse-grained force fields without expert knowledge.
In recent years, the prediction of quantum mechanical observables with machine learning methods has become increasingly popular. Message-passing neural networks (MPNNs) solve this task by constructing atomic representations, from which the properties of interest are predicted. Here, we introduce a method to automatically identify chemical moieties (molecular building blocks) from such representations, enabling a variety of applications beyond property prediction, which otherwise rely on expert knowledge. The required representation can either be provided by a pretrained MPNN, or learned from scratch using only structural information. Beyond the data-driven design of molecular fingerprints, the versatility of our approach is demonstrated by enabling the selection of representative entries in chemical databases, the automatic construction of coarse-grained force fields, as well as the identification of reaction coordinates.
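To make the idea of deriving moieties from atomic representations concrete, the following is a minimal sketch and not the authors' method: the abstract does not specify the algorithm, so plain k-means clustering is swapped in as a stand-in for assigning atom types, and moieties are taken to be bonded groups of atoms that share a type. The feature vectors, the chain-like bond graph, and the helper names `kmeans` and `group_into_moieties` are all hypothetical; in practice the features would come from an MPNN.

```python
import numpy as np


def kmeans(x, k, n_iter=20):
    """Plain k-means with deterministic farthest-point initialisation.

    Assumption: a stand-in for whatever type assignment the real method uses.
    """
    centers = [x[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels


def group_into_moieties(labels, adjacency):
    """Connected components of the bond graph restricted to same-type atoms."""
    n = len(labels)
    moiety = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if moiety[start] != -1:
            continue
        moiety[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.nonzero(adjacency[i])[0]:
                if moiety[j] == -1 and labels[j] == labels[i]:
                    moiety[j] = current
                    stack.append(j)
        current += 1
    return moiety


# Hypothetical per-atom features for a six-atom chain 0-1-2-3-4-5.
features = np.array([
    [0.0, 0.1], [0.1, 0.0],   # atoms 0, 1: environment "A"
    [5.0, 5.1], [5.1, 5.0],   # atoms 2, 3: environment "B"
    [0.0, 0.0], [0.1, 0.1],   # atoms 4, 5: environment "A" again
])
adjacency = np.zeros((6, 6), dtype=int)
for a in range(5):
    adjacency[a, a + 1] = adjacency[a + 1, a] = 1

labels = kmeans(features, k=2)
moieties = group_into_moieties(labels, adjacency)
print(moieties.tolist())  # [0, 0, 1, 1, 2, 2]
```

In this toy case atoms 0, 1 and atoms 4, 5 share a type but are not bonded to each other, so they end up as two separate instances of the same kind of building block.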