BM CHEM-PH ML Dec 6, 2021

Collective variable discovery in the age of machine learning: reality, hype and everything in between

arXiv:2112.03202v1 35 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the challenge of selecting effective low-dimensional variables, which matters to researchers in computational chemistry and drug discovery, but it is incremental because it reviews existing approaches without presenting new results.

This review examines the use of machine learning to discover collective variables for analyzing biomolecular dynamics from molecular dynamics simulations, highlighting cases where such methods are applied unnecessarily to simple systems and proposing future applications with artificial general intelligence.

Understanding the kinetic and thermodynamic profiles of biomolecules is necessary to understand their functional roles, which has a major impact on mechanism-driven drug discovery. Molecular dynamics simulation is routinely used to study conformational dynamics and molecular recognition in biomolecules. Statistical analysis of the high-dimensional spatiotemporal data generated by molecular dynamics simulations requires identifying a few low-dimensional variables that describe the essential dynamics of a system without significant loss of information. In physical chemistry, these low-dimensional variables are often called collective variables. Collective variables are used to generate reduced representations of the free energy surface and to calculate transition probabilities between different metastable basins. However, the choice of collective variables is not trivial for complex systems. Collective variables range from geometric criteria, such as distances and dihedral angles, to abstract ones, such as weighted linear combinations of multiple geometric variables. The advent of machine learning algorithms has led to increasing use of abstract collective variables to represent biomolecular dynamics. In this review, I will highlight several nuances of commonly used collective variables, ranging from geometric to abstract ones. Further, I will put forward some cases where machine-learning-based collective variables were used to describe simple systems that in principle could have been described by geometric ones. Finally, I will put forward my thoughts on artificial general intelligence and how it can be used to discover and predict collective variables from the spatiotemporal data generated by molecular dynamics simulations.
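To make the distinction between geometric and abstract collective variables concrete, here is a minimal sketch in plain Python. It is not taken from the paper: the function names (`distance`, `linear_cv`, `free_energy_profile`) are hypothetical, and the histogram estimate F(s) = -k_B T ln P(s) is the standard textbook relation, assumed here as the way a reduced free energy profile along a collective variable s might be obtained.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol*K)

def distance(a, b):
    """Geometric CV: Euclidean distance between two atomic positions."""
    return math.dist(a, b)

def linear_cv(features, weights):
    """Abstract CV: weighted linear combination of geometric variables.

    The weights are an assumption of this sketch; in practice they would
    come from a method such as PCA or a learned model.
    """
    return sum(w * f for w, f in zip(weights, features))

def free_energy_profile(cv_values, n_bins, temperature=300.0):
    """Estimate F(s) = -kB*T*ln P(s) from a histogram of CV samples."""
    lo, hi = min(cv_values), max(cv_values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all samples are equal
    counts = [0] * n_bins
    for s in cv_values:
        idx = min(int((s - lo) / width), n_bins - 1)  # clamp the upper edge
        counts[idx] += 1
    total = len(cv_values)
    kt = KB * temperature
    # Bins that were never visited get infinite free energy.
    f = [-kt * math.log(c / total) if c else math.inf for c in counts]
    f_min = min(f)
    return [x - f_min for x in f]  # shift so the lowest basin sits at zero

# Usage: one frame described by two distances, combined into one abstract CV.
d1 = distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
d2 = distance((1.0, 0.0, 0.0), (1.0, 2.0, 0.0))
s = linear_cv([d1, d2], weights=[0.5, 0.5])
```

The profile is returned in kJ/mol relative to its minimum, so the most populated bin reads as zero and less populated bins read as positive free energies.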

