LG | CHEM-PH | Nov 30, 2020

HydroNet: Benchmark Tasks for Preserving Intermolecular Interactions and Structural Motifs in Predictive and Generative Models for Molecular Data

arXiv:2012.00131v1 | 3 citations
Originality: Synthesis-oriented
AI Analysis

This work provides a new benchmark dataset for researchers developing predictive and generative machine learning models for molecular data. It focuses on preserving intermolecular interactions and structural motifs, an incremental step toward more accurate models for chemical problems.

This paper introduces HydroNet, a benchmark dataset of 4.95 million water clusters, to address the challenge of preserving intermolecular interactions and structural motifs in machine learning models for chemical problems. The dataset includes spatial coordinates and two graph representations to facilitate diverse machine learning approaches.

Intermolecular and long-range interactions are central to phenomena as diverse as gene regulation, topological states of quantum materials, electrolyte transport in batteries, and the universal solvation properties of water. We present a set of challenge problems for preserving intermolecular interactions and structural motifs in machine-learning approaches to chemical problems, through the use of a recently published dataset of 4.95 million water clusters held together by hydrogen bonding interactions and resulting in longer range structural patterns. The dataset provides spatial coordinates as well as two types of graph representations, to accommodate a variety of machine-learning practices.
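
The abstract does not say how the two graph representations are defined. As a purely illustrative sketch, one common way to turn cluster coordinates into a graph is a geometric hydrogen-bond criterion: add a directed edge from a donor molecule to an acceptor molecule whenever one of the donor's hydrogens lies within a cutoff distance of the acceptor's oxygen. The function name `hydrogen_bond_graph`, the 2.5 Å cutoff, and the toy dimer coordinates below are all assumptions made for this example, not values taken from the dataset.

```python
import math

def hydrogen_bond_graph(oxygens, hydrogens, cutoff=2.5):
    """Build a directed molecule-level graph from water cluster coordinates.

    oxygens:   list of (x, y, z) tuples, one oxygen per molecule
    hydrogens: list of pairs of (x, y, z) tuples, two hydrogens per molecule
    cutoff:    H...O distance threshold in angstroms (an assumed value)

    Returns a set of (donor_index, acceptor_index) edges.
    """
    edges = set()
    for donor, h_pair in enumerate(hydrogens):
        for h in h_pair:
            for acceptor, o in enumerate(oxygens):
                # Skip the covalent O-H bonds within the same molecule.
                if acceptor != donor and math.dist(h, o) <= cutoff:
                    edges.add((donor, acceptor))
    return edges

# Toy water dimer (hypothetical coordinates in angstroms).
oxygens = [(0.0, 0.0, 0.0), (2.9, 0.0, 0.0)]
hydrogens = [
    ((0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)),  # molecule 0 donates along +x
    ((3.2, 0.9, 0.0), (3.2, -0.9, 0.0)),     # molecule 1 points away
]

print(hydrogen_bond_graph(oxygens, hydrogens))
```

A distance-only criterion is the simplest choice; stricter definitions also constrain the O-H...O angle, which would add one more check inside the inner loop.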

