Danny Perez

MTRL-SCI
h-index: 19
5 papers
12 citations
Novelty: 43%
AI Score: 41

5 Papers

NA | Sep 22, 2014
Analysis of Transition State Theory Rates upon Spatial Coarse-Graining

Andrew Binder, Mitchell Luskin, Danny Perez et al.

Spatial multiscale methods have established themselves as useful tools for extending the length scales accessible by conventional statics (i.e., zero temperature molecular dynamics). Recently, extensions of these methods, such as the finite-temperature quasicontinuum (hot-QC) or Coarse-Grained Molecular Dynamics (CGMD) methods, have allowed for multiscale molecular dynamics simulations at finite temperature. Here, we assess the quality of the long-time dynamics these methods generate by considering canonical transition rates. Specifically, we analyze the transition state theory (TST) rates in CGMD and compare them to the corresponding TST rate of the fully atomistic system. The ability of such an approach to reliably reproduce the TST rate is verified through a relative error analysis, which is then used to highlight the major contributions to the error and guide the choice of degrees of freedom. Finally, our analytical results are compared with numerical simulations for the case of a 1-D chain.
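As a rough illustration of the quantities being compared, the sketch below evaluates a TST rate in the standard harmonic (Vineyard) approximation and the relative error between a coarse-grained and a reference rate. The harmonic form is an assumption made here for concreteness; the abstract does not say which TST expression the paper analyzes. The function names `harmonic_tst_rate` and `relative_error`, and the unit mass default, are hypothetical.

```python
import numpy as np

def harmonic_tst_rate(hess_min, hess_saddle, barrier, kT, mass=1.0):
    """Harmonic TST rate: (prod nu_min / prod nu_saddle) * exp(-barrier / kT).

    Assumes all degrees of freedom share the same mass (illustrative choice).
    """
    # Squared angular frequencies are Hessian eigenvalues divided by the mass.
    w_min = np.linalg.eigvalsh(np.atleast_2d(hess_min)) / mass
    w_sad = np.linalg.eigvalsh(np.atleast_2d(hess_saddle)) / mass
    if np.any(w_min <= 0):
        raise ValueError("Hessian at the minimum must be positive definite")
    unstable = w_sad < 0
    if unstable.sum() != 1:
        raise ValueError("saddle Hessian must have exactly one negative eigenvalue")
    # Work with logarithms so that products over many modes do not overflow.
    log_prefactor = 0.5 * (np.sum(np.log(w_min)) - np.sum(np.log(w_sad[~unstable])))
    log_prefactor -= np.log(2.0 * np.pi)
    return float(np.exp(log_prefactor - barrier / kT))

def relative_error(k_coarse, k_reference):
    """Relative error of a coarse-grained rate against the atomistic reference."""
    return abs(k_coarse - k_reference) / k_reference
```

For a one-dimensional double well V(x) = x^4 - 2x^2 (curvature 8 at the minimum, -4 at the saddle, barrier 1), `harmonic_tst_rate(8.0, -4.0, 1.0, kT)` reduces to the single attempt frequency sqrt(8)/(2π) times the Boltzmann factor.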

4.1 | MTRL-SCI | Apr 29
Predicting Atomistic Transitions with Transformers

Henry Tischler, Wenting Li, Qi Tang et al.

Accurate knowledge of the atomistic transition pathways in materials and material surfaces is crucial for many materials science problems. However, conventional simulation techniques used to find these transitions are extremely computationally intensive. Even with large-scale, accelerated material simulations, the computational cost constrains the applicable domain in practice. Machine learning models, with the potential to learn the complex emergent behaviors governing atomistic transitions as a fast surrogate model, have great promise to predict transitions with a vastly reduced computational cost. Here, we demonstrate how transformers can be trained to predict atomistic transitions in nano-clusters. We show how we evaluate the physical validity of the predictions and how a multitude of additional, different microstates can be generated by slightly varying the data provided to the model.

ML | Feb 2, 2024
Parameter uncertainties for imperfect surrogate models in the low-noise regime

Thomas D Swinburne, Danny Perez

Bayesian regression determines model parameters by minimizing the expected loss, an upper bound to the true generalization error. However, the loss ignores misspecification, where models are imperfect. Parameter uncertainties from Bayesian regression are thus significantly underestimated and vanish in the large data limit. This is particularly problematic when building models of low-noise, or near-deterministic, calculations, as the main source of uncertainty is neglected. We analyze the generalization error of misspecified, near-deterministic surrogate models, a regime of broad relevance in science and engineering. We show posterior distributions must cover every training point to avoid a divergent generalization error and design an ansatz that respects this constraint, which for linear models incurs minimal overhead. This is demonstrated on model problems before application to thousand-dimensional datasets in atomistic machine learning. Our efficient misspecification-aware scheme gives accurate prediction and bounding of test errors where existing schemes fail, allowing this important source of uncertainty to be incorporated in computational workflows.
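The problem described here, parameter uncertainties that shrink with more data while the model error does not, can be reproduced with a small toy example. The sketch below fits a straight line to a deterministic quadratic using textbook Bayesian linear regression with a Gaussian prior and a fixed, small noise variance. It illustrates only the failure mode, not the authors' misspecification-aware ansatz; the names `posterior` and `misspecified_fit`, the target function, and the noise and prior variances are all assumptions chosen for illustration.

```python
import numpy as np

def posterior(Phi, y, noise_var, prior_var=1e6):
    """Gaussian posterior over linear-model weights (zero-mean isotropic prior)."""
    precision = Phi.T @ Phi / noise_var + np.eye(Phi.shape[1]) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ (Phi.T @ y) / noise_var
    return mean, cov

def misspecified_fit(n, noise_var=1e-6, seed=0):
    """Fit y = w0 + w1 * x to the noise-free target y = x**2 on [-1, 1]."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n)
    y = x**2                                   # target lies outside the model class
    Phi = np.column_stack([np.ones(n), x])     # linear features only
    mean, cov = posterior(Phi, y, noise_var)
    rmse = float(np.sqrt(np.mean((Phi @ mean - y) ** 2)))
    max_param_std = float(np.sqrt(np.diag(cov)).max())
    return rmse, max_param_std

if __name__ == "__main__":
    for n in (10, 100, 1000):
        rmse, std = misspecified_fit(n)
        print(f"n={n:5d}  rmse={rmse:.3f}  max posterior std={std:.2e}")
```

In this toy setting the residual error stays of order 0.3 however many points are added, while the posterior standard deviation of the weights keeps shrinking, so the posterior reports far more confidence than the fit deserves.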

LG | Jun 13, 2025
Accurate and Uncertainty-Aware Multi-Task Prediction of HEA Properties Using Prior-Guided Deep Gaussian Processes

Sk Md Ahnaf Akif Alvi, Mrinalini Mulukutla, Nicolas Flores et al.

Surrogate modeling techniques have become indispensable in accelerating the discovery and optimization of high-entropy alloys (HEAs), especially when integrating computational predictions with sparse experimental observations. This study systematically evaluates the fitting performance of four prominent surrogate models: conventional Gaussian Processes (cGP), Deep Gaussian Processes (DGP), encoder-decoder neural networks for multi-output regression, and XGBoost, applied to a hybrid dataset of experimental and computational properties in the AlCoCrCuFeMnNiV HEA system. We specifically assess their capabilities in predicting correlated material properties, including yield strength, hardness, modulus, ultimate tensile strength, elongation, and average hardness under dynamic and quasi-static conditions, alongside auxiliary computational properties. The comparison highlights the strengths of hierarchical and deep modeling approaches in handling heteroscedastic, heterotopic, and incomplete data commonly encountered in materials informatics. Our findings illustrate that DGPs infused with a machine learning-based prior outperform the other surrogates by effectively capturing inter-property correlations and input-dependent uncertainty. This enhanced predictive accuracy positions advanced surrogate models as powerful tools for robust and data-efficient materials design.

MTRL-SCI | Sep 17, 2025
Deep Gaussian Process-based Cost-Aware Batch Bayesian Optimization for Complex Materials Design Campaigns

Sk Md Ahnaf Akif Alvi, Brent Vela, Vahid Attari et al.

The accelerating pace and expanding scope of materials discovery demand optimization frameworks that efficiently navigate vast, nonlinear design spaces while judiciously allocating limited evaluation resources. We present a cost-aware, batch Bayesian optimization scheme powered by deep Gaussian process (DGP) surrogates and a heterotopic querying strategy. Our DGP surrogate, formed by stacking GP layers, models complex hierarchical relationships among high-dimensional compositional features and captures correlations across multiple target properties, propagating uncertainty through successive layers. We integrate evaluation cost into an upper-confidence-bound acquisition extension, which, together with heterotopic querying, proposes small batches of candidates in parallel, balancing exploration of under-characterized regions with exploitation of high-mean, low-variance predictions across correlated properties. Applied to refractory high-entropy alloys for high-temperature applications, our framework converges to optimal formulations in fewer iterations with cost-aware queries than conventional GP-based BO, highlighting the value of deep, uncertainty-aware, cost-sensitive strategies in materials campaigns.
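One simple way to fold evaluation cost into an upper-confidence-bound score is to subtract a cost penalty and pick the highest-scoring candidates as a batch. The sketch below does exactly that. It is an assumed form for illustration only: the abstract does not give the acquisition function, and the additive penalty, the weights `kappa` and `lam`, and the function names are hypothetical. The surrogate's predictive means and standard deviations are taken as given inputs, so no DGP is implemented here.

```python
import numpy as np

def cost_aware_ucb(mu, sigma, cost, kappa=2.0, lam=1.0):
    """UCB score with an additive cost penalty (assumed form, not the paper's)."""
    mu, sigma, cost = (np.asarray(a, dtype=float) for a in (mu, sigma, cost))
    return mu + kappa * sigma - lam * cost

def select_batch(mu, sigma, cost, batch_size, kappa=2.0, lam=1.0):
    """Return indices of the batch_size candidates with the highest scores."""
    scores = cost_aware_ucb(mu, sigma, cost, kappa, lam)
    order = np.argsort(-scores, kind="stable")   # descending, ties keep input order
    return order[:batch_size].tolist()

if __name__ == "__main__":
    mu = [1.0, 0.5, 0.9]      # predicted property values (to be maximized)
    sigma = [0.1, 0.5, 0.1]   # predictive standard deviations
    cost = [1.0, 0.2, 0.1]    # relative evaluation costs
    print(select_batch(mu, sigma, cost, batch_size=2))
```

Greedy top-k selection like this ignores redundancy between batch members; a practical batch scheme would typically also discourage picking near-duplicate candidates.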