Learning Collective Variables from Time-lagged Generation
This work addresses the challenge of accurately sampling rare events in molecular dynamics, a problem relevant to researchers in computational chemistry and biophysics. It represents an incremental improvement over existing machine learning CV methods.
The paper tackles the problem of learning collective variables (CVs) for enhanced sampling in molecular dynamics. It proposes TLC, a framework that learns CVs from the time-lagged conditions of a generative model so that they capture slow dynamic behavior. On the Alanine Dipeptide system, TLC performs as well as or better than existing methods in tasks such as steered molecular dynamics and on-the-fly probability enhanced sampling.
Rare events such as state transitions are difficult to observe directly with molecular dynamics simulations because they occur on long timescales. Enhanced sampling techniques overcome this by introducing biases along carefully chosen low-dimensional features, known as collective variables (CVs), which capture the slow degrees of freedom. Machine learning approaches (MLCVs) have automated CV discovery, but existing methods typically focus on discriminating metastable states without fully encoding the detailed dynamics essential for accurate sampling. We propose TLC, a framework that learns CVs directly from time-lagged conditions of a generative model. Instead of modeling the static Boltzmann distribution, TLC models a time-lagged conditional distribution, yielding CVs that capture the slow dynamic behavior. We validate TLC on the Alanine Dipeptide system using two CV-based enhanced sampling tasks: (i) steered molecular dynamics (SMD) and (ii) on-the-fly probability enhanced sampling (OPES), demonstrating equal or superior performance compared to existing MLCV methods in both transition path sampling and state discrimination.
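To make the two ingredients named above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a trajectory stored as a NumPy array of shape (frames, features) and shows (a) how time-lagged pairs (x_t, x_{t+lag}) could be assembled as training data for a time-lagged conditional model, and (b) a harmonic restraint of the kind commonly used in steered molecular dynamics. The function names `time_lagged_pairs` and `harmonic_steering_bias`, the array layout, and the force constant `k` are all hypothetical; the abstract does not specify TLC's data pipeline or its steering protocol.

```python
import numpy as np


def time_lagged_pairs(traj, lag):
    """Split a trajectory into (x_t, x_{t+lag}) pairs.

    traj: array of shape (T, D), one row per frame (assumed layout).
    lag:  lag time in frames, with 1 <= lag < T.
    Returns two arrays of shape (T - lag, D).
    """
    traj = np.asarray(traj)
    if lag < 1 or lag >= len(traj):
        raise ValueError("lag must satisfy 1 <= lag < number of frames")
    # Row i of the first array is paired with row i of the second,
    # which lies `lag` frames later in the trajectory.
    return traj[:-lag], traj[lag:]


def harmonic_steering_bias(cv_value, cv_target, k):
    """Harmonic restraint 0.5 * k * |s - s_target|^2 in CV space.

    This is a generic SMD-style bias; in practice the target is moved
    gradually from the initial state toward the final state.
    """
    diff = np.asarray(cv_value, dtype=float) - np.asarray(cv_target, dtype=float)
    return 0.5 * k * np.sum(diff ** 2)


# Toy usage: 5 frames with 2 features each, lag of 2 frames.
toy_traj = np.arange(10, dtype=float).reshape(5, 2)
x_now, x_later = time_lagged_pairs(toy_traj, lag=2)
bias = harmonic_steering_bias(cv_value=[1.0], cv_target=[0.0], k=2.0)
```

In this sketch, a conditional generative model would be trained to generate `x_later` given a low-dimensional encoding of `x_now`, and that encoding would then serve as the CV passed to a bias such as `harmonic_steering_bias`. How TLC actually parameterizes the condition is not stated in the abstract.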