QM LG ML May 31, 2018

Conformation Clustering of Long MD Protein Dynamics with an Adversarial Autoencoder

arXiv:1805.12313v1

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient conformation clustering in protein dynamics analysis and offers a domain-specific tool for researchers in computational biology. It is an incremental advance because it builds on existing autoencoder techniques.

The paper tackles the problem of clustering protein conformations from long molecular dynamics simulations in order to identify protein states and understand folding behavior. It proposes a novel method based on the adversarial autoencoder and reports that the method identifies many of the salient folding features in a 208-microsecond simulation of the Trp-Cage peptide.

Recent developments in specialized computer hardware have greatly accelerated atomic-level Molecular Dynamics (MD) simulations. A single GPU-attached cluster is capable of producing microsecond-length trajectories in reasonable amounts of time. Multiple protein states and a large number of microstates associated with folding and with the function of the protein can be observed as conformations sampled in the trajectories. Clustering those conformations, however, is needed for identifying protein states, evaluating transition rates and understanding protein behavior. In this paper, we propose a novel data-driven generative conformation clustering method based on the adversarial autoencoder (AAE) and provide the associated software implementation Cong. The method was tested using a 208-microsecond MD simulation of the fast-folding peptide Trp-Cage (20 residues) obtained from the D.E. Shaw Research Group. The proposed clustering algorithm identifies many of the salient features of the folding process by grouping a large number of conformations that share common features not easily identifiable in the trajectory.
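
The abstract notes that clustering is a prerequisite for evaluating transition rates. As a minimal sketch of why, assume each trajectory frame has already been assigned an integer cluster label; a row-normalized transition matrix can then be estimated by counting jumps between labels. The function name `transition_matrix`, the example labels, and the state names are hypothetical and are not taken from the paper or from the Cong software, which may evaluate rates differently.

```python
import numpy as np

def transition_matrix(labels, n_states, lag=1):
    """Estimate row-normalized transition probabilities between cluster labels.

    labels   : per-frame integer cluster labels in [0, n_states)
    n_states : number of clusters
    lag      : number of frames between the source and destination state (>= 1)
    """
    labels = np.asarray(labels)
    counts = np.zeros((n_states, n_states), dtype=float)
    # Count every observed jump from the state at frame t to the state at frame t + lag.
    np.add.at(counts, (labels[:-lag], labels[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # States with no outgoing observations keep an all-zero row instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Hypothetical labels for nine frames (0 = unfolded, 1 = intermediate, 2 = folded).
labels = [0, 0, 1, 1, 2, 2, 2, 1, 0]
T = transition_matrix(labels, n_states=3)
print(T)
```

Entry `T[i, j]` is the fraction of frames in cluster `i` that are followed, `lag` frames later, by a frame in cluster `j`. Turning these probabilities into physical rates would additionally require the time step between saved frames, which this listing does not state.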

