LG BM Jan 8, 2024

Scalable Normalizing Flows Enable Boltzmann Generators for Macromolecules

arXiv:2401.04246v1
12 citations
h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficiently sampling protein conformations from the Boltzmann distribution, a task relevant to drug discovery and structural biology. It is a domain-specific advance.

Normalizing flows have so far been intractable for modeling the conformational distributions of macromolecules. The authors introduce a flow architecture with split channels and gated attention, together with a multi-stage training strategy that uses a 2-Wasserstein loss. With these, they model HP35 (35 residues) and protein G (56 residues), where standard architectures and maximum likelihood training alone fail.

The Boltzmann distribution of a protein provides a roadmap to all of its functional states. Normalizing flows are a promising tool for modeling this distribution, but current methods are intractable for typical pharmacological targets; they become computationally intractable due to the size of the system, heterogeneity of intra-molecular potential energy, and long-range interactions. To remedy these issues, we present a novel flow architecture that utilizes split channels and gated attention to efficiently learn the conformational distribution of proteins defined by internal coordinates. We show that by utilizing a 2-Wasserstein loss, one can smooth the transition from maximum likelihood training to energy-based training, enabling the training of Boltzmann Generators for macromolecules. We evaluate our model and training strategy on villin headpiece HP35(nle-nle), a 35-residue subdomain, and protein G, a 56-residue protein. We demonstrate that standard architectures and training strategies, such as maximum likelihood alone, fail while our novel architecture and multi-stage training strategy are able to model the conformational distributions of protein G and HP35.
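To make the 2-Wasserstein quantity mentioned above concrete, here is a minimal sketch for the simplest case only: two equal-sized, one-dimensional samples. In that setting the optimal coupling pairs the sorted values in order, so the distance has a closed form. This is an illustration under those assumptions, not the authors' loss, which acts on high-dimensional protein conformations; the function name `wasserstein2_1d` is hypothetical.

```python
import math
import random


def wasserstein2_1d(xs, ys):
    """Empirical 2-Wasserstein distance between two equal-sized 1D samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal transport plan matches the i-th smallest
    # value of one sample to the i-th smallest value of the other.
    paired = zip(sorted(xs), sorted(ys))
    mean_sq = sum((x - y) ** 2 for x, y in paired) / len(xs)
    return math.sqrt(mean_sq)


if __name__ == "__main__":
    rng = random.Random(0)
    reference = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    # Shifting every point by a constant moves the distribution by that
    # constant, so the distance should be close to 0.5.
    shifted = [x + 0.5 for x in reference]
    print(wasserstein2_1d(reference, shifted))
```

Unlike a likelihood, this distance is computed from samples alone and shrinks smoothly as the two samples move toward each other. That may help explain why such a loss can bridge maximum likelihood and energy-based training, although the abstract does not spell out the mechanism.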

