LG | Nov 13, 2025
Pretrained Joint Predictions for Scalable Batch Bayesian Optimization of Molecular Designs
Miles Wang-Henderson, Benjamin Kaufman, Edward Williams et al.
Batched synthesis and testing of molecular designs is the key bottleneck of drug development. There has been great interest in leveraging biomolecular foundation models as surrogates to accelerate this process. In this work, we show how to obtain scalable probabilistic surrogates of binding affinity for use in Batch Bayesian Optimization (Batch BO). This demands parallel acquisition functions that hedge between designs and the ability to rapidly sample from a joint predictive density to approximate them. Through the framework of Epistemic Neural Networks (ENNs), we obtain scalable joint predictive distributions of binding affinity on top of representations taken from large structure-informed models. Key to this work is an investigation into the importance of prior networks in ENNs and how to pretrain them on synthetic data to improve downstream performance in Batch BO. Their utility is demonstrated by rediscovering known potent EGFR inhibitors on a semi-synthetic benchmark in up to 5x fewer iterations, as well as potent inhibitors from a real-world small-molecule library in up to 10x fewer iterations, offering a promising solution for large-scale drug discovery applications.
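The abstract's requirement that a parallel acquisition function "hedge between designs" using samples from a joint predictive density can be illustrated with a minimal sketch. It assumes a Monte Carlo batch expected improvement with greedy selection, which is a common choice but not necessarily the acquisition function used in the paper. The names `q_expected_improvement` and `greedy_batch` are hypothetical, and the hand-built `samples` array stands in for joint draws that an ENN would supply.

```python
import numpy as np

def q_expected_improvement(joint_samples, best_so_far):
    """Monte Carlo batch EI (illustrative, not the paper's code).

    joint_samples: array of shape (n_samples, batch_size); each row is one
    joint draw of predicted affinity for every design in the batch.
    """
    batch_best = joint_samples.max(axis=1)  # best design within each joint draw
    return np.maximum(batch_best - best_so_far, 0.0).mean()

def greedy_batch(samples, best_so_far, batch_size):
    """Greedily grow a batch that maximizes Monte Carlo batch EI.

    samples: array of shape (n_samples, n_candidates) of joint draws over
    all candidates (assumed to come from some joint predictive model).
    """
    chosen = []
    remaining = list(range(samples.shape[1]))
    for _ in range(batch_size):
        scores = [
            q_expected_improvement(samples[:, chosen + [j]], best_so_far)
            for j in remaining
        ]
        pick = remaining[int(np.argmax(scores))]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen

# Candidates 0 and 1 are perfectly correlated; candidate 2 is weaker on
# average but succeeds exactly when the other two fail.
samples = np.array([
    [2.0, 2.0, 0.0],
    [0.0, 0.0, 1.5],
    [2.0, 2.0, 0.0],
    [0.0, 0.0, 1.5],
])
print(greedy_batch(samples, best_so_far=0.0, batch_size=2))  # → [0, 2]
```

Ranking candidates by their marginal scores alone would select designs 0 and 1, which are redundant. Scoring the batch against joint draws selects design 2 as the second member, because it pays off in exactly the draws where design 0 does not.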
CHEM-PH | Oct 28, 2021
How Well Does Kohn-Sham Regularizer Work for Weakly Correlated Systems?
Bhupalee Kalita, Ryan Pederson, Jielun Chen et al.
Kohn-Sham regularizer (KSR) is a differentiable machine learning approach to finding the exchange-correlation functional in Kohn-Sham density functional theory (DFT) that works for strongly correlated systems. Here we test KSR for weak correlation. We propose spin-adapted KSR (sKSR) with trainable local, semilocal, and nonlocal approximations found by minimizing density and total energy loss. We assess the atoms-to-molecules generalizability by training on one-dimensional (1D) H, He, Li, Be, Be$^{++}$ and testing on 1D hydrogen chains, LiH, BeH$_2$, and helium hydride complexes. The generalization error from our semilocal approximation is comparable to other differentiable approaches, but our nonlocal functional outperforms any existing machine learning functionals, predicting ground-state energies of test systems with a mean absolute error of 2.7 milli-Hartrees.
COMP-PH | Sep 17, 2020
Kohn-Sham equations as regularizer: building prior knowledge into machine-learned physics
Li Li, Stephan Hoyer, Ryan Pederson et al.
Including prior knowledge is important for effective machine learning models in physics, and is usually achieved by explicitly adding loss terms or constraints on model architectures. Prior knowledge embedded in the physics computation itself rarely draws attention. We show that solving the Kohn-Sham equations when training neural networks for the exchange-correlation functional provides an implicit regularization that greatly improves generalization. Two separations suffice for learning the entire one-dimensional H$_2$ dissociation curve within chemical accuracy, including the strongly correlated region. Our models also generalize to unseen types of molecules and overcome self-interaction error.