BMDec 2, 2025
Unlocking hidden biomolecular conformational landscapes in diffusion models at inference timeDaniel D. Richman, Jessica Karaguesian, Carl-Mikael Suomivuori et al.
The function of biomolecules such as proteins depends on their ability to interconvert between a wide range of structures or "conformations." Researchers have endeavored for decades to develop computational methods to predict the distribution of conformations, which is far harder to determine experimentally than a static folded structure. We present ConforMix, an inference-time algorithm that enhances sampling of conformational distributions using a combination of classifier guidance, filtering, and free energy estimation. Our approach upgrades diffusion models -- whether trained for static structure prediction or conformational generation -- to enable more efficient discovery of conformational variability without requiring prior knowledge of major degrees of freedom. ConforMix is orthogonal to improvements in model pretraining and would benefit even a hypothetical model that perfectly reproduced the Boltzmann distribution. Remarkably, when applied to a diffusion model trained for static structure prediction, ConforMix captures structural changes including domain motion, cryptic pocket flexibility, and transporter cycling, while avoiding unphysical states. Case studies of biologically critical proteins demonstrate the scalability, accuracy, and utility of this method.
MTRL-SCISep 20, 2024
Learning Ordering in Crystalline Materials with Symmetry-Aware Graph Neural NetworksJiayu Peng, James Damewood, Jessica Karaguesian et al.
Graph convolutional neural networks (GCNNs) have become a machine learning workhorse for screening the chemical space of crystalline materials in fields such as catalysis and energy storage, by predicting properties from structures. Multicomponent materials, however, present a unique challenge since they can exhibit chemical (dis)order, where a given lattice structure can encompass a variety of elemental arrangements ranging from highly ordered structures to fully disordered solid solutions. Critically, properties like stability, strength, and catalytic performance depend not only on structures but also on orderings. To enable rigorous materials design, it is thus critical to ensure GCNNs are capable of distinguishing among atomic orderings. However, the ordering-aware capability of GCNNs has been poorly understood. Here, we benchmark various neural network architectures for capturing the ordering-dependent energetics of multicomponent materials in a custom-made dataset generated with high-throughput atomistic simulations. Conventional symmetry-invariant GCNNs were found unable to discern the structural difference between the diverse symmetrically inequivalent atomic orderings of the same material, while symmetry-equivariant model architectures could inherently preserve and differentiate the distinct crystallographic symmetries of various orderings.