Tiered Graph Autoencoders with PyTorch Geometric for Molecular Graphs
This work addresses molecular representation for computational chemistry; its contribution is incremental, adapting existing tiered graph autoencoders to PyTorch Geometric.
The authors tackle the problem of representing molecular graphs with tiered latent spaces that explicitly capture groups such as functional groups. The resulting framework provides tiered latent representations for each molecular graph, enabling exploration and navigation across the node, group, and graph tiers.
Tiered latent representations and latent spaces for molecular graphs consist of the atom (node) tier, the group tier, and the molecule (graph) tier, and they provide a simple but effective way to explicitly represent and utilize groups (e.g., functional groups). They can be learned using the tiered graph autoencoder architecture. In this paper, we discuss adapting tiered graph autoencoders for use with PyTorch Geometric, for both the deterministic tiered graph autoencoder model and the probabilistic tiered variational graph autoencoder model. We also discuss molecular structure information sources that can be accessed to extract training data for molecular graphs. To support transfer learning, a critical consideration is that the information must use standard unique identifiers for molecules and their constituent atoms. As a result of using tiered graph autoencoders for deep learning, each molecular graph possesses tiered latent representations. At each tier, the latent representation consists of node features, edge indices, edge features, a membership matrix, and node embeddings. This enables the utilization and exploration of tiered molecular latent spaces, either individually (the node tier, the group tier, or the graph tier) or jointly, as well as navigation across the tiers.
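To make the five per-tier components concrete, the following is a minimal illustrative sketch in plain Python, not the authors' implementation and not PyTorch Geometric code. The class name `TierRepresentation`, the helper `pool_by_membership`, the toy values, and the choice of mean pooling to move from one tier to the next are all assumptions made here for illustration; the abstract does not specify how tiers are connected beyond the membership matrix.

```python
from dataclasses import dataclass
from typing import List, Tuple

Matrix = List[List[float]]


@dataclass
class TierRepresentation:
    """Hypothetical container for one tier's latent representation."""
    node_features: Matrix               # one row per node at this tier
    edge_index: List[Tuple[int, int]]   # (source, target) node pairs
    edge_features: Matrix               # one row per edge
    membership: List[List[int]]         # membership[i][j] = 1 if node i is in unit j of the next tier
    node_embeddings: Matrix             # one latent vector per node


def pool_by_membership(embeddings: Matrix, membership: List[List[int]]) -> Matrix:
    """Mean-pool node embeddings into one vector per next-tier unit.

    Mean pooling is an assumption of this sketch, chosen for simplicity.
    """
    n_units = len(membership[0])
    dim = len(embeddings[0])
    pooled = []
    for j in range(n_units):
        members = [i for i, row in enumerate(membership) if row[j]]
        if not members:
            pooled.append([0.0] * dim)  # empty unit: fall back to a zero vector
            continue
        pooled.append(
            [sum(embeddings[i][d] for i in members) / len(members) for d in range(dim)]
        )
    return pooled


# Toy node tier: four atoms in a chain, split into two groups of two atoms each.
node_tier = TierRepresentation(
    node_features=[[6.0], [8.0], [6.0], [7.0]],   # made-up atomic-number features
    edge_index=[(0, 1), (1, 2), (2, 3)],
    edge_features=[[1.0], [1.0], [2.0]],          # made-up bond-order features
    membership=[[1, 0], [1, 0], [0, 1], [0, 1]],  # atoms 0,1 -> group 0; atoms 2,3 -> group 1
    node_embeddings=[[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]],
)

# Navigate upward: node tier -> group tier -> graph tier.
group_embeddings = pool_by_membership(node_tier.node_embeddings, node_tier.membership)
graph_embedding = pool_by_membership(group_embeddings, [[1], [1]])
```

In this sketch the membership matrix is what allows navigation across tiers: the same pooling step maps atom embeddings to group embeddings and group embeddings to a single molecule embedding.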