A cloud platform for automating and sharing analysis of raw simulation data from high throughput polymer molecular dynamics simulations

arXiv:2208.01692v14 citations · h-index: 125 · Has Code
Originality: Synthesis-oriented
AI Analysis

This platform addresses the bottleneck of inaccessible raw data for the computational materials science community, though the contribution is incremental, since it builds on existing database and cloud technologies.

The authors tackled the problem of sharing and analyzing large-scale raw simulation data in computational materials science by developing a cloud-based platform that automates post-processing and enables public access, currently hosting 6286 molecular dynamics trajectories and 5.7 terabytes of data.

Open material databases storing hundreds of thousands of material structures and their corresponding properties have become the cornerstone of modern computational materials science. Yet the raw outputs of the simulations, such as the trajectories from molecular dynamics simulations and charge densities from density functional theory calculations, are generally not shared due to their huge size. In this work, we describe a cloud-based platform to facilitate the sharing of raw data and enable fast post-processing in the cloud to extract new user-defined properties. As an initial demonstration, our database currently includes 6286 molecular dynamics trajectories for amorphous polymer electrolytes and 5.7 terabytes of data. We create a public analysis library at https://github.com/TRI-AMDD/htp_md to extract multiple properties from the raw data, using both expert-designed functions and machine learning models. The analysis is run automatically with computation in the cloud, and the results then populate a database that can be accessed publicly. Our platform encourages users to contribute both new trajectory data and analysis functions via public interfaces. Newly analyzed properties will be incorporated into the database. Finally, we create a front-end user interface at https://www.htpmd.matr.io for browsing and visualization of our data. We envision the platform as a new way of sharing raw data and new insights for the computational materials science community.
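
The workflow described in the abstract (contributed analysis functions run automatically over every stored trajectory, with the results populating a database) can be illustrated with a minimal sketch. This is not the htp_md API: the names `ANALYSIS_REGISTRY`, `register`, `mean_squared_displacement`, and `analyze_all` are hypothetical, a trajectory is assumed to be a plain list of frames of (x, y, z) positions, and an in-memory dictionary stands in for the database. The example property is a simple mean squared displacement between the first and last frame, chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Assumed data model (hypothetical): a trajectory is a list of frames,
# and each frame is a list of (x, y, z) particle positions.
Position = Tuple[float, float, float]
Trajectory = List[List[Position]]

# Hypothetical registry mapping a property name to its analysis function.
ANALYSIS_REGISTRY: Dict[str, Callable[[Trajectory], float]] = {}


def register(name: str):
    """Decorator that adds an analysis function to the registry."""
    def decorator(func: Callable[[Trajectory], float]):
        ANALYSIS_REGISTRY[name] = func
        return func
    return decorator


@register("msd_final")
def mean_squared_displacement(traj: Trajectory) -> float:
    """Mean squared displacement between the first and last frame,
    averaged over particles. Periodic boundaries are not unwrapped."""
    first, last = traj[0], traj[-1]
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(first, last):
        total += (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
    return total / len(first)


def analyze_all(trajectories: Dict[str, Trajectory]) -> Dict[str, Dict[str, float]]:
    """Apply every registered function to every trajectory and collect
    the results in a dictionary keyed by trajectory id."""
    results: Dict[str, Dict[str, float]] = {}
    for traj_id, traj in trajectories.items():
        results[traj_id] = {
            name: func(traj) for name, func in ANALYSIS_REGISTRY.items()
        }
    return results


# Toy two-particle, two-frame trajectory (made-up numbers).
toy = [
    [(0, 0, 0), (1, 1, 1)],
    [(1, 0, 0), (1, 3, 1)],
]
database = analyze_all({"toy-trajectory": toy})
```

In this sketch, contributing a new property amounts to writing one more function decorated with `register`; rerunning `analyze_all` then adds that property for every trajectory, which mirrors the idea that newly analyzed properties are incorporated into the database.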
