Kevin Rossi

9.4 · DL · Mar 11

Journal Research Data Policies in Materials Science

Lukas Hörmann, Hemanadhan Myneni, Rwayda Kh. S. Al-Hamd et al.

Open and reproducible research in materials science relies on the availability of data, code, and common metadata standards. Journal research data policies (RDPs) remain a primary mechanism by which publication norms are defined and enforced. We survey RDPs for 171 materials science journals spanning 17 publishers, using an expanded coding framework that captures both data-and-code sharing behavior and refereeing standards. We find clear signs of progress compared with earlier research on RDPs: nearly all journals provide an RDP, and most mention data availability statements. However, enforceable requirements remain uncommon, public deposition of underlying data is rarely mandatory, and FAIR publication is typically encouraged rather than required. Expectations for research software are substantially less developed than those for data, with limited attention to versioning and persistent identifiers, dependency disclosure, reproducible execution environments, or software quality practices. Aggregating the findings on policy features into an open research data score reveals pronounced heterogeneity across journals. Neither impact factor nor access model reliably predicts policy strength. Double-coding further shows that more complex and stricter policies can be harder to interpret consistently, and we highlight challenges in consistent RDP encoding across studies. We conclude with recommended best-practice directions for the future.

CHEM-PH · Nov 10, 2020

Uncertainty estimation for molecular dynamics and sampling

Giulio Imbalzano, Yongbin Zhuang, Venkat Kapil et al.

Machine learning models have emerged as a very effective strategy to sidestep time-consuming electronic-structure calculations, enabling accurate simulations of greater size, time scale and complexity. Given the interpolative nature of these models, the reliability of predictions depends on the position in phase space, and it is crucial to obtain an estimate of the error that derives from the finite number of reference structures included during the training of the model. When using a machine-learning potential to sample a finite-temperature ensemble, the uncertainty on individual configurations translates into an error on thermodynamic averages, and provides an indication of the loss of accuracy when the simulation enters a previously unexplored region. Here we discuss how uncertainty quantification can be used, together with a baseline energy model, or a more robust although less accurate interatomic potential, to obtain more resilient simulations and to support active-learning strategies. Furthermore, we introduce an on-the-fly reweighting scheme that makes it possible to estimate the uncertainty in the thermodynamic averages extracted from long trajectories. We present examples covering different types of structural and thermodynamic properties, and systems as diverse as water and liquid gallium.
