Federico Grasselli

MTRL-SCI
h-index: 8
Papers: 4
Citations: 34
Novelty: 43%
AI Score: 35

4 Papers

MTRL-SCI | May 11, 2022
Predicting hot-electron free energies from ground-state data

Chiheb Ben Mahmoud, Federico Grasselli, Michele Ceriotti

Machine-learning potentials are usually trained on the ground-state, Born-Oppenheimer energy surface, which depends exclusively on the atomic positions and not on the simulation temperature. This disregards the effect of thermally excited electrons, which is important in metals and essential to the description of warm dense matter. An accurate physical description of these effects requires that the nuclei move on a temperature-dependent electronic free energy. We propose a method to obtain machine-learning predictions of this free energy at an arbitrary electron temperature using exclusively training data from ground-state calculations, avoiding the need to train temperature-dependent potentials, and benchmark it on metallic liquid hydrogen at the conditions of the core of gas giants and brown dwarfs. This work demonstrates the advantages of hybrid schemes that use physical considerations to combine machine-learning predictions, providing a blueprint for the development of similar approaches that extend the reach of atomistic modelling by removing the barrier between physics and data-driven methodologies.
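One textbook way to attach an electron temperature to ground-state data is to occupy a fixed density of states (DOS) with Fermi-Dirac statistics and evaluate the resulting free energy. The sketch below illustrates that idea only; it is not taken from the paper, it assumes independent electrons and a DOS that does not change with temperature, and every name in it (`fermi`, `chemical_potential`, `free_energy`, the uniform energy grid) is hypothetical.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def fermi(energy, mu, kt):
    """Fermi-Dirac occupation, written to avoid overflow in exp()."""
    x = (energy - mu) / kt
    if x >= 0.0:
        ex = math.exp(-x)
        return ex / (1.0 + ex)
    return 1.0 / (1.0 + math.exp(x))


def chemical_potential(energies, dos, n_electrons, kt, n_iter=100):
    """Bisect for the chemical potential that yields n_electrons."""
    de = energies[1] - energies[0]  # assumes a uniform energy grid
    lo, hi = energies[0] - 50.0 * kt, energies[-1] + 50.0 * kt
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        count = de * sum(g * fermi(e, mid, kt) for e, g in zip(energies, dos))
        if count < n_electrons:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def free_energy(energies, dos, n_electrons, temperature):
    """Return (A, mu) with A = U - T*S for a fixed DOS (assumption)."""
    kt = K_B * temperature
    mu = chemical_potential(energies, dos, n_electrons, kt)
    de = energies[1] - energies[0]
    band_energy = 0.0
    entropy = 0.0
    for e, g in zip(energies, dos):
        f = fermi(e, mu, kt)
        band_energy += g * e * f
        if 0.0 < f < 1.0:  # fully filled or empty states carry no entropy
            entropy -= g * (f * math.log(f) + (1.0 - f) * math.log(1.0 - f))
    band_energy *= de
    entropy *= K_B * de
    return band_energy - temperature * entropy, mu
```

In a hybrid scheme of the kind the abstract describes, the `dos` input would come from a model trained on ground-state calculations, and the temperature would enter only through the occupations.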

9.4 | DL | Mar 11
Journal Research Data Policies in Materials Science

Lukas Hörmann, Hemanadhan Myneni, Rwayda Kh. S. Al-Hamd et al.

Open and reproducible research in materials science relies on the availability of data, code, and common metadata standards. Journal research data policies (RDPs) remain a primary mechanism by which publication norms are defined and enforced. We survey RDPs for 171 materials science journals spanning 17 publishers, using an expanded coding framework that captures both data-and-code sharing behavior and refereeing standards. We find clear signs of progress in comparison to earlier research on RDPs: nearly all journals provide an RDP, and most mention data availability statements. However, enforceable requirements remain uncommon, public deposition of underlying data is rarely mandatory, and FAIR publication is typically encouraged rather than required. Expectations for research software are substantially less developed than those for data, with limited attention to versioning and persistent identifiers, dependency disclosure, reproducible execution environments, or software quality practices. Aggregating the findings on policy features into an open research data score reveals pronounced heterogeneity across journals. Neither impact factor nor access model reliably predicts policy strength. Double-coding further shows that more complex and stricter policies can be more challenging to interpret consistently, and we highlight challenges in consistent RDP encoding across studies. We conclude with recommended best-practice directions for the future.

ML | Mar 4, 2024
A prediction rigidity formalism for low-cost uncertainties in trained neural networks

Filippo Bigi, Sanggyu Chong, Michele Ceriotti et al.

Regression methods are fundamental for scientific and technological applications. However, fitted models can be highly unreliable outside of their training domain, and hence the quantification of their uncertainty is crucial in many of their applications. Based on the solution of a constrained optimization problem, we propose "prediction rigidities" as a method to obtain uncertainties of arbitrary pre-trained regressors. We establish a strong connection between our framework and Bayesian inference, and we develop a last-layer approximation that allows the new method to be applied to neural networks. This extension affords cheap uncertainties without any modification to the neural network itself or its training procedure. We show the effectiveness of our method on a wide range of regression tasks, from simple toy models to applications in chemistry and meteorology.
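A last-layer uncertainty can be illustrated with the familiar linear-regression form, in which the variance of a prediction scales as f^T (F^T F + reg I)^-1 f, with F the matrix of last-layer features of the training set and f those of the query. The sketch below is an illustration under that assumption, not the authors' code; the regularizer `reg`, the calibration factor `scale`, and all function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def regularized_gram(features, reg):
    """Build F^T F + reg * I from a list of feature vectors."""
    d = len(features[0])
    gram = [[reg if i == j else 0.0 for j in range(d)] for i in range(d)]
    for f in features:
        for i in range(d):
            for j in range(d):
                gram[i][j] += f[i] * f[j]
    return gram


def last_layer_variance(train_features, query, reg=1e-3, scale=1.0):
    """Variance estimate scale * f^T (F^T F + reg I)^-1 f for one query."""
    gram = regularized_gram(train_features, reg)
    v = solve(gram, query)
    return scale * sum(q * vi for q, vi in zip(query, v))


# Training features lie along the first axis only (toy data).
train = [[1.0, 0.0] for _ in range(10)]
inside = last_layer_variance(train, [1.0, 0.0])   # direction seen in training
outside = last_layer_variance(train, [0.0, 1.0])  # direction never seen
```

Because the estimate needs only the feature vectors, it can be computed after training, which matches the abstract's point that the network and its training procedure stay unmodified. In practice `scale` and `reg` would be calibrated on held-out data.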

CHEM-PH | Nov 10, 2020
Uncertainty estimation for molecular dynamics and sampling

Giulio Imbalzano, Yongbin Zhuang, Venkat Kapil et al.

Machine learning models have emerged as a very effective strategy to sidestep time-consuming electronic-structure calculations, enabling accurate simulations of greater size, time scale and complexity. Given the interpolative nature of these models, the reliability of predictions depends on the position in phase space, and it is crucial to obtain an estimate of the error that derives from the finite number of reference structures included during the training of the model. When using a machine-learning potential to sample a finite-temperature ensemble, the uncertainty on individual configurations translates into an error on thermodynamic averages, and provides an indication for the loss of accuracy when the simulation enters a previously unexplored region. Here we discuss how uncertainty quantification can be used, together with a baseline energy model, or a more robust although less accurate interatomic potential, to obtain more resilient simulations and to support active-learning strategies. Furthermore, we introduce an on-the-fly reweighting scheme that makes it possible to estimate the uncertainty in the thermodynamic averages extracted from long trajectories. We present examples covering different types of structural and thermodynamic properties, and systems as diverse as water and liquid gallium.
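The step from per-configuration uncertainty to an error on a thermodynamic average can be illustrated with plain Boltzmann reweighting. The sketch below assumes a committee of models (the abstract does not say how the uncertainty is obtained) and a trajectory sampled with the committee-mean energy; each member's average is recovered with weights exp(-beta (V_i - V_mean)), and the spread across members serves as the error estimate. The paper's own scheme may differ, for example by approximating these weights, and all names here are hypothetical.

```python
import math
import statistics


def reweighted_average(observable, member_energy, mean_energy, beta):
    """Average of an observable reweighted from the mean model to one member."""
    exponents = [-beta * (vi - vm) for vi, vm in zip(member_energy, mean_energy)]
    shift = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(x - shift) for x in exponents]
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, observable)) / total


def committee_average_uncertainty(observable, committee_energies, beta):
    """Return (mean, standard deviation) of the member-wise reweighted averages."""
    n_members = len(committee_energies)
    n_frames = len(observable)
    mean_energy = [
        sum(member[t] for member in committee_energies) / n_members
        for t in range(n_frames)
    ]
    averages = [
        reweighted_average(observable, member, mean_energy, beta)
        for member in committee_energies
    ]
    return statistics.mean(averages), statistics.stdev(averages)


# Toy trajectory: four frames, three committee members (made-up numbers).
observable = [1.0, 2.0, 3.0, 4.0]
committee = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.2, 0.3],
    [0.0, -0.1, -0.2, -0.3],
]
avg, err = committee_average_uncertainty(observable, committee, beta=1.0)
```

Direct reweighting of this kind becomes statistically inefficient when the weights vary strongly along a long trajectory, which is one reason a production scheme might replace the exact weights with an approximation.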