BM, STAT-MECH, LG | Jan 19, 2024

Clustering Molecular Energy Landscapes by Adaptive Network Embedding

arXiv:2401.10972v1 | J Mater Informatics
Originality: Incremental advance
AI Analysis

An incremental method that lets computational chemistry researchers reduce the dimensionality of molecular energy-landscape data for downstream machine learning tasks.

The paper tackles the problem of exploring chemical space by clustering molecular energy landscapes using network embedding to obtain latent variables, and demonstrates the framework on Lennard-Jones clusters and a human DNA sequence.

To explore the chemical space of all possible small molecules efficiently, a common approach is to reduce the dimensionality of the system to facilitate downstream machine learning tasks. To this end, we present a data-driven approach for clustering potential energy landscapes of molecular structures by applying recently developed network embedding techniques to obtain latent variables defined through the embedding function. To scale up the method, we also incorporate an entropy-sensitive adaptive scheme for hierarchical sampling of the energy landscape, based on metadynamics and transition path theory. By taking into account the kinetic information implied by a system's energy landscape, we are able to interpret dynamical node-node relationships in reduced dimensions. We demonstrate the framework on Lennard-Jones (LJ) clusters and a human DNA sequence.
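To make the idea of embedding an energy landscape as a network more concrete, here is a minimal sketch. It is not the paper's method. It assumes the landscape has already been reduced to a set of minima (nodes) joined by transition states (edges), weights each edge with an Arrhenius-style rate, and uses a plain spectral (graph Laplacian) embedding as a stand-in for the network embedding technique the abstract refers to. The function names, the toy energies, and the choice of `kT` are all hypothetical.

```python
import numpy as np

def rate_matrix(minima_energy, ts_energy, kT=1.0):
    """Return K with K[i, j] = exp(-(E_ts - E_i) / kT), an Arrhenius-style
    rate from minimum i to minimum j (prefactor set to 1 by assumption).

    ts_energy maps an undirected edge (i, j) to its transition-state energy.
    """
    n = len(minima_energy)
    K = np.zeros((n, n))
    for (i, j), e_ts in ts_energy.items():
        K[i, j] = np.exp(-(e_ts - minima_energy[i]) / kT)
        K[j, i] = np.exp(-(e_ts - minima_energy[j]) / kT)
    return K

def spectral_embedding(K, dim=1):
    """Embed nodes using eigenvectors of the normalized graph Laplacian.

    The rate matrix is symmetrized into edge weights, which discards
    directional information; this is a simplification for illustration.
    Assumes every node has at least one edge (nonzero degree).
    """
    W = 0.5 * (K + K.T)
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue 0) and undo the degree scaling.
    return vecs[:, 1:1 + dim] * inv_sqrt[:, None]

# Hypothetical toy landscape: two funnels of three minima each, with low
# barriers inside each funnel and one high barrier between minima 2 and 3.
minima_energy = np.zeros(6)
ts_energy = {
    (0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
    (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0,
    (2, 3): 6.0,
}

K = rate_matrix(minima_energy, ts_energy)
embedding = spectral_embedding(K)

# Cluster by the sign of the single latent coordinate.
labels = (embedding[:, 0] > 0).astype(int)
print(labels)
```

In this toy case the one-dimensional latent coordinate separates the two funnels, because the high barrier makes the link between them kinetically weak. Which funnel gets label 0 or 1 depends on the arbitrary sign of the eigenvector.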

