Giovanni Pizzi

MTRL-SCI
4 papers
577 citations
Novelty 48%
AI Score 51

4 Papers

77.6 DC May 26 Code
Accelerating discovery across scientific disciplines through reproducible workflows with AiiDAlab

Aliaksandr V. Yakutovich, Daniel Hollas, Edan Bainglass et al.

With ever-increasing computational capabilities, robust and automated research workflows have become essential for orchestrating large numbers of interdependent simulations. However, significant technical expertise is still required to configure execution environments, define calculation inputs, interpret outputs, and manage the complexity of parallel code execution on remote machines. To address these challenges, we developed AiiDAlab, a Jupyter-based web platform powered by the AiiDA computational infrastructure that provides a framework for managing and automating computational workflows while ensuring reproducibility through full provenance tracking. Through a collection of open-source user-friendly applications, AiiDAlab enables scientists to set up, execute, and analyze complex computational workflows without interacting directly with the underlying technical details, allowing them to focus on their research questions. In this paper, we discuss how AiiDAlab has matured over the past few years, expanding beyond computational materials science and its AiiDA origins. We present recent developments towards integrating with electronic laboratory notebooks (ELNs) for FAIR-compliant data management, adoption in large-scale facilities for secure access to experimental data and analytical tools, and applications in educational settings. Together with community-driven efforts to simplify onboarding, improve access to computational resources, and support large-scale data workflows, these advancements position AiiDAlab as a powerful platform for accelerating scientific discovery and fostering collaboration across disciplines.

MTRL-SCI Jul 18, 2022
Machine-learning accelerated identification of exfoliable two-dimensional materials

Mohammad Tohidi Vahdat, Kumar Varoon Agrawal, Giovanni Pizzi

Two-dimensional (2D) materials have been a central focus of recent research because they host a variety of properties, making them attractive both for fundamental science and for applications. It is thus crucial to be able to identify accurately and efficiently whether bulk three-dimensional (3D) materials are formed by layers held together by a weak binding energy and can thus potentially be exfoliated into 2D materials. In this work, we develop a machine-learning (ML) approach that, combined with a fast preliminary geometrical screening, is able to efficiently identify potentially exfoliable materials. Starting from a combination of descriptors for crystal structures, we work out a subset of them that are crucial for accurate predictions. Our final ML model, based on a random forest classifier, has a very high recall of 98%. Using a SHapley Additive exPlanations (SHAP) analysis, we also provide an intuitive explanation of the five most important variables of the model. Finally, we compare the performance of our best ML model with a deep neural network architecture using the same descriptors. To make our algorithms and models easily accessible, we publish an online tool on the Materials Cloud portal that only requires a bulk 3D crystal structure as input. Our tool thus provides a practical yet straightforward approach to assess whether any 3D compound can be exfoliated into 2D layers.
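The recall figure quoted in the abstract can be made concrete with a small sketch. The snippet below is not the authors' model or data: the labels are made up for illustration, and the helper name `recall` is an assumption. It only shows how recall is computed for a binary "exfoliable / not exfoliable" classifier, that is, the fraction of truly exfoliable materials that the classifier also flags.

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that are also predicted positive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no positive samples in y_true")
    return true_pos / (true_pos + false_neg)


# Hypothetical labels (1 = exfoliable, 0 = not exfoliable); not from the paper.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]

print(f"recall = {recall(y_true, y_pred):.2f}")
```

A high recall matters for a screening tool of this kind because a missed exfoliable material (a false negative) is dropped from further study, whereas a false positive can still be filtered out by a later, more expensive calculation.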

18.6 DB Mar 12
optimade-maker: Automated generation of interoperable materials APIs from static data

Kristjan Eimre, Matthew L. Evans, Bud Macaulay et al.

Atomistic structural data are central to materials science, condensed matter physics, and chemistry, and are increasingly digitised across diverse repositories and databases. Interoperable access to these heterogeneous data sources enables reusable clients and tools, and is essential for cross-database analyses and data-driven materials discovery. Toward this aim, the OPTIMADE (Open Databases Integration for Materials Design) specification defines a standard REST API for atomistic structures and related properties. However, deploying and maintaining compliant services remains technically demanding and poses a significant barrier for many data providers. Here, we present optimade-maker, a lightweight toolkit for the automated generation of OPTIMADE-compliant APIs directly from raw atomistic structure and property data. The toolkit supports a wide range of raw datasets, enables conversion to a standardised OPTIMADE data representation, and allows for rapid deployment of APIs in both local and production environments. We further demonstrate it through an automated service on the Materials Cloud Archive, which automatically creates and publishes OPTIMADE APIs for contributed datasets, enabling immediate discoverability and interoperability. In addition, we implement data transformation pipelines for the Cambridge Structural Database (CSD) and the Inorganic Crystal Structure Database (ICSD), enabling unified access to these curated resources through the OPTIMADE framework. By lowering the technical barriers to interoperable data publication, optimade-maker represents an important step toward a scalable, FAIR materials data ecosystem integrating both community-contributed and curated databases.
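To illustrate what a standard REST API for structures looks like from the client side, here is a minimal sketch that builds an OPTIMADE query URL. The base URL is a placeholder and the helper name `structures_query_url` is an assumption. The `/v1/structures` endpoint, the `filter` and `page_limit` query parameters, and the `elements HAS ALL` filter syntax come from the OPTIMADE specification. The sketch does not use optimade-maker itself and makes no network request.

```python
from urllib.parse import quote, urlencode


def structures_query_url(base_url, elements, page_limit=10):
    """Build a URL that asks for structures containing all given elements."""
    # OPTIMADE filter strings quote each element symbol with double quotes.
    quoted = ", ".join(f'"{el}"' for el in elements)
    params = {
        "filter": f"elements HAS ALL {quoted}",
        "page_limit": page_limit,
    }
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    query = urlencode(params, quote_via=quote)
    return f"{base_url.rstrip('/')}/v1/structures?{query}"


# Placeholder base URL; substitute the address of a real OPTIMADE provider.
url = structures_query_url("https://example.org/optimade", ["Si", "O"], page_limit=5)
print(url)
```

Because every compliant provider exposes the same endpoint and filter grammar, the same function can target any OPTIMADE database by changing only the base URL, which is the interoperability benefit the abstract describes.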

COMP-PH Apr 5, 2015 Code
AiiDA: Automated Interactive Infrastructure and Database for Computational Science

Giovanni Pizzi, Andrea Cepellotti, Riccardo Sabatini et al.

Computational science has seen in the last decades a spectacular rise in the scope, breadth, and depth of its efforts. Notwithstanding this prevalence and impact, it is often still performed using the renaissance model of individual artisans gathered in a workshop, under the guidance of an established practitioner. Great benefits could follow instead from adopting concepts and tools coming from computer science to manage, preserve, and share these computational efforts. We illustrate here our paradigm sustaining such a vision, based around the four pillars of Automation, Data, Environment, and Sharing. We then discuss its implementation in the open-source AiiDA platform (http://www.aiida.net), which has been tuned first to the demands of computational materials science. AiiDA's design is based on directed acyclic graphs to track the provenance of data and calculations, and ensure preservation and searchability. Remote computational resources are managed transparently, and automation is coupled with data storage to ensure reproducibility. Last, complex sequences of calculations can be encoded into scientific workflows. We believe that AiiDA's design and its sharing capabilities will encourage the creation of social ecosystems to disseminate codes, data, and scientific workflows.
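The idea of tracking provenance with a directed acyclic graph can be sketched in a few lines. The snippet below is a toy model, not AiiDA's API: the class `ProvenanceGraph`, its methods, and the node names are all hypothetical. It records links between data and calculation nodes, refuses links that would create a cycle, and answers the question "which inputs and calculations led to this result?".

```python
from collections import defaultdict


class ProvenanceGraph:
    """Toy provenance store: nodes are strings, links point from input to output."""

    def __init__(self):
        self._parents = defaultdict(set)  # node -> set of direct predecessors

    def ancestors(self, node):
        """Return every node from which `node` can be reached."""
        seen, stack = set(), [node]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def add_link(self, source, target):
        """Record that `source` was used to produce `target`."""
        # A link source -> target closes a cycle if target already leads to source.
        if source == target or target in self.ancestors(source):
            raise ValueError(f"link {source} -> {target} would create a cycle")
        self._parents[target].add(source)


# Hypothetical two-step workflow: relax a structure, then compute its bands.
graph = ProvenanceGraph()
graph.add_link("structure", "relax_calc")
graph.add_link("relax_calc", "relaxed_structure")
graph.add_link("relaxed_structure", "bands_calc")
graph.add_link("bands_calc", "band_structure")

print(sorted(graph.ancestors("band_structure")))
```

Keeping the graph acyclic is what makes the history of any result well defined: a node can never be, directly or indirectly, one of its own inputs.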