Luis Itza Vazquez-Salazar

h-index13
2papers

2 Papers

CHEM-PHJul 21, 2024
${\it Asparagus}$: A Toolkit for Autonomous, User-Guided Construction of Machine-Learned Potential Energy Surfaces

Kai Töpfer, Luis Itza Vazquez-Salazar, Markus Meuwly

With the establishment of machine learning (ML) techniques in the scientific community, the construction of ML potential energy surfaces (ML-PES) has become a standard process in physics and chemistry. So far, improvements in the construction of ML-PES models have been conducted independently, creating an initial hurdle for new users to overcome and complicating the reproducibility of results. Aiming to reduce the bar for the extensive use of ML-PES, we introduce ${\it Asparagus}$, a software package encompassing the different parts into one coherent implementation that allows an autonomous, user-guided construction of ML-PES models. ${\it Asparagus}$ combines capabilities of initial data sampling with interfaces to ${\it ab initio}$ calculation programs, ML model training, as well as model evaluation and its application within other codes such as ASE or CHARMM. The functionalities of the code are illustrated in different examples, including the dynamics of small molecules, the representation of reactive potentials in organometallic compounds, and atom diffusion on periodic surface structures. The modular framework of ${\it Asparagus}$ is designed to allow simple implementations of further ML-related methods and models to provide constant user-friendly access to state-of-the-art ML techniques.

CHEM-PHFeb 27, 2024
Outlier-Detection for Reactive Machine Learned Potential Energy Surfaces

Luis Itza Vazquez-Salazar, Silvan Käser, Markus Meuwly

Uncertainty quantification (UQ) to detect samples with large expected errors (outliers) is applied to reactive molecular potential energy surfaces (PESs). Three methods - Ensembles, Deep Evidential Regression (DER), and Gaussian Mixture Models (GMM) - were applied to the H-transfer reaction between ${\it syn-}$Criegee and vinyl hydroxyperoxide. The results indicate that ensemble models provide the best results for detecting outliers, followed by GMM. For example, from a pool of 1000 structures with the largest uncertainty, the detection quality for outliers is $\sim 90$ \% and $\sim 50$ \%, respectively, if 25 or 1000 structures with large errors are sought. On the contrary, the limitations of the statistical assumptions of DER greatly impacted its prediction capabilities. Finally, a structure-based indicator was found to be correlated with large average error, which may help to rapidly classify new structures into those that provide an advantage for refining the neural network.