LG | Oct 2, 2023
Transformers are efficient hierarchical chemical graph learners
Zihan Pengmei, Zimu Li, Chih-chan Tien et al.
Transformers, adapted from natural language processing, are emerging as a leading approach for graph representation learning. Contemporary graph transformers often treat nodes or edges as separate tokens. This approach leads to computational challenges for even moderately sized graphs due to the quadratic scaling of self-attention complexity with token count. In this paper, we introduce SubFormer, a graph transformer that operates on subgraphs that aggregate information by a message-passing mechanism. This approach reduces the number of tokens and enhances learning of long-range interactions. We demonstrate SubFormer on benchmarks for predicting molecular properties from chemical structures and show that it is competitive with state-of-the-art graph transformers at a fraction of the computational cost, with training times on the order of minutes on a consumer-grade graphics card. We interpret the attention weights in terms of chemical structures. We show that SubFormer exhibits limited over-smoothing and avoids over-squashing, which is prevalent in traditional graph neural networks.
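The quadratic-scaling argument in this abstract can be illustrated with a short back-of-the-envelope calculation. The sketch below is not taken from the paper: the atom and subgraph counts are hypothetical, and `attention_pairs` is an illustrative helper that counts only the query-key pairs scored by one dense self-attention layer.

```python
def attention_pairs(num_tokens: int) -> int:
    """Number of query-key pairs scored by one dense self-attention layer."""
    return num_tokens * num_tokens


# Hypothetical molecule (illustrative numbers, not from the paper):
# 60 atoms treated as node tokens versus 15 subgraph tokens.
num_atoms = 60
num_subgraphs = 15

atom_level = attention_pairs(num_atoms)
subgraph_level = attention_pairs(num_subgraphs)
reduction = atom_level / subgraph_level

print(atom_level, subgraph_level, reduction)
```

Under these assumed counts, cutting the token count by a factor of 4 cuts the attention cost by a factor of 16, which is the kind of saving the subgraph tokenization is aiming for.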
LG | Sep 30, 2024
Using pretrained graph neural networks with token mixers as geometric featurizers for conformational dynamics
Zihan Pengmei, Chatipat Lorpaiboon, Spencer C. Guo et al.
Identifying informative low-dimensional features that characterize dynamics in molecular simulations remains a challenge, often requiring extensive manual tuning and system-specific knowledge. Here, we introduce geom2vec, in which pretrained graph neural networks (GNNs) are used as universal geometric featurizers. By pretraining equivariant GNNs on a large dataset of molecular conformations with a self-supervised denoising objective, we obtain transferable structural representations that are useful for learning conformational dynamics without further fine-tuning. We show how the learned GNN representations can capture interpretable relationships between structural units (tokens) by combining them with expressive token mixers. Importantly, decoupling training the GNNs from training for downstream tasks enables analysis of larger molecular graphs (such as small proteins at all-atom resolution) with limited computational resources. In these ways, geom2vec eliminates the need for manual feature selection and increases the robustness of simulation analyses.
CHEM-PH | Jan 28
Quantum statistics from classical simulations via generative Gibbs sampling
Weizhou Wang, Xuanxi Zhang, Jonathan Weare et al.
Accurate simulation of nuclear quantum effects is essential for molecular modeling but expensive using path integral molecular dynamics (PIMD). We present GG-PI, a ring-polymer-based framework that combines generative modeling of the single-bead conditional density with Gibbs sampling to recover quantum statistics from classical simulation data. GG-PI uses inexpensive standard classical simulations or existing data for training and allows transfer across temperatures without retraining. On standard test systems, GG-PI significantly reduces wall clock time compared to PIMD. Our approach extends easily to a wide range of problems with similar Markov structure.
LG | May 11
Composing diffusion priors with explicit physical context via generative Gibbs sampling
Weizhou Wang, Jonathan Weare, Aaron R. Dinner
Pretrained diffusion models provide powerful learned priors, but in scientific sampling the target distribution often depends on physical context that is not fully represented by one generative model. We introduce Generative Gibbs for Physics-Aware Sampling (GG-PA), a training-free framework that formulates the composition of learned partial priors and explicit physical context as inference over a joint target distribution in an augmented state space. We derive a Gibbs sampler for this joint target, show that it is asymptotically exact as the diffusion time approaches zero, and prove that in settings with quadratic interactions it remains exact at finite diffusion times. We further introduce replica exchange over diffusion time to accelerate mixing. Experiments on a double-well system, a $ϕ^4$ lattice model, and atomistic peptide systems show that GG-PA recovers context-induced distribution shifts and emergent collective behavior in interacting systems using partial priors without retraining. These results demonstrate GG-PA as a practical approach for combining pretrained generative priors with explicit physical context.
LG | May 6, 2025
Mitigating mode collapse in normalizing flows by annealing with an adaptive schedule: Application to parameter estimation
Yihang Wang, Chris Chi, Aaron R. Dinner
Normalizing flows (NFs) provide uncorrelated samples from complex distributions, making them an appealing tool for parameter estimation. However, the practical utility of NFs remains limited by their tendency to collapse to a single mode of a multimodal distribution. In this study, we show that annealing with an adaptive schedule based on the effective sample size (ESS) can mitigate mode collapse. We demonstrate that our approach can converge the marginal likelihood for a biochemical oscillator model fit to time-series data in ten-fold less computation time than a widely used ensemble Markov chain Monte Carlo (MCMC) method. We show that the ESS can also be used to reduce variance by pruning the samples. We expect these developments to be of general use for sampling with NFs and discuss potential opportunities for further improvements.
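An ESS-based adaptive annealing schedule can be sketched generically as follows. This is a common construction from tempered importance sampling, not necessarily the authors' procedure: it assumes a geometric path between a base and a target density, and picks the next inverse temperature by bisection so that the ESS of the incremental importance weights stays at a chosen fraction of the sample count. The names `next_beta`, `log_ratio`, and `target_fraction` are hypothetical.

```python
import math


def effective_sample_size(log_weights):
    """Kish ESS, (sum w)^2 / sum w^2, computed stably from log-weights."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    s = sum(w)
    return s * s / sum(x * x for x in w)


def next_beta(log_ratio, beta, target_fraction=0.5, tol=1e-6):
    """Largest beta_new in [beta, 1] (to within tol) keeping ESS >= target.

    Assumes log_ratio[i] = log p_target(x_i) - log p_base(x_i) for samples
    x_i drawn at inverse temperature `beta`, so the incremental log-weights
    are (beta_new - beta) * log_ratio[i].
    """
    target = target_fraction * len(log_ratio)

    def ess_at(b):
        return effective_sample_size([(b - beta) * r for r in log_ratio])

    # Jump straight to the target distribution if the ESS allows it.
    if ess_at(1.0) >= target:
        return 1.0

    # The ESS is non-increasing in the step size, so bisection applies.
    lo, hi = beta, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ess_at(mid) >= target:
            lo = mid
        else:
            hi = mid
    return lo


# Toy log-ratios with one outlier that would dominate a full step.
log_ratio = [0.0, 1.0, 2.0, 3.0, 50.0]
b = next_beta(log_ratio, beta=0.0)
print(b)
```

In a full annealing loop, one would retrain or resample at the returned `beta` and repeat until it reaches 1. If the returned value equals the current `beta`, the step has stalled at the tolerance and the target fraction or the sample size needs adjusting.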
STAT-MECH | Aug 29, 2025
An exact multiple-time-step variational formulation for the committor and the transition rate
Chatipat Lorpaiboon, Jonathan Weare, Aaron R. Dinner
For a transition between two stable states, the committor is the probability that the dynamics leads to one stable state before the other. It can be estimated from trajectory data by minimizing an expression for the transition rate that depends on a lag time. We show that an existing such expression is minimized by the exact committor only when the lag time is a single time step, resulting in a biased estimate in practical applications. We introduce an alternative expression that is minimized by the exact committor at any lag time. Numerical tests on benchmark systems demonstrate that our committor and resulting transition rate estimates are much less sensitive to the choice of lag time. We derive an additional expression for the transition rate, relate the transition rate expression to a variational approach for kinetic statistics based on the mean-squared residual, and discuss further numerical considerations with the aid of a decomposition of the error into dynamic modes.
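To make the committor definition concrete, the sketch below computes it for a small discrete-time Markov chain with a known transition matrix. This is only an illustration of the quantity being estimated, not the variational, lag-time-dependent estimator discussed in the abstract. The function name `committor`, the five-state random walk, and the choice of Gauss-Seidel iteration are all assumptions made for the example.

```python
def committor(P, A, B, n_iter=10_000, tol=1e-12):
    """Forward committor of a discrete-time Markov chain.

    Solves q_i = sum_j P[i][j] * q_j for states outside A and B, with
    q = 0 on A and q = 1 on B, by Gauss-Seidel fixed-point iteration.
    """
    n = len(P)
    q = [1.0 if i in B else 0.0 for i in range(n)]
    interior = [i for i in range(n) if i not in A and i not in B]
    for _ in range(n_iter):
        change = 0.0
        for i in interior:
            new = sum(P[i][j] * q[j] for j in range(n))
            change = max(change, abs(new - q[i]))
            q[i] = new
        if change < tol:
            break
    return q


# Symmetric random walk on states 0..4; state 0 is A and state 4 is B.
P = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
q = committor(P, A={0}, B={4})
print(q)
```

For this walk the committor is linear in the state index, q_i = i/4, which gives a quick sanity check on the solver.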