LG · QM · Jul 16, 2024

Molecular Topological Profile (MOLTOP) -- Simple and Strong Baseline for Molecular Graph Classification

arXiv:2407.12136v3 · 6 citations · h-index: 5
Originality: Incremental advance
AI Analysis

This work provides a strong, simple baseline for evaluating Graph Neural Networks in molecular graph classification, which is incremental but crucial for accurate assessment in the domain.

The authors tackled molecular graph classification by designing a simple baseline that combines topological descriptors with a Random Forest classifier. Across eleven benchmark datasets it performs competitively with modern Graph Neural Networks, and its discriminative power exceeds the 1-WL test and, for some classes of graphs, even the 3-WL test.

We revisit the effectiveness of topological descriptors for molecular graph classification and design a simple yet strong baseline. We demonstrate that a simple approach to feature engineering (histogram aggregation of edge descriptors plus one-hot encoding of atomic numbers and bond types), combined with a Random Forest classifier, can establish a strong baseline for Graph Neural Networks (GNNs). The novel algorithm, Molecular Topological Profile (MOLTOP), integrates Edge Betweenness Centrality, Adjusted Rand Index and SCAN Structural Similarity score. This approach proves to be remarkably competitive with modern GNNs, while also being simple, fast, low-variance and hyperparameter-free. Our approach is rigorously tested on MoleculeNet datasets using the fair evaluation protocol provided by Open Graph Benchmark. We additionally show out-of-domain generalization capabilities on the peptide classification task from the Long Range Graph Benchmark. The evaluations across eleven benchmark datasets reveal MOLTOP's strong discriminative capabilities, surpassing the $1$-WL test and even the $3$-WL test for some classes of graphs. Our conclusion is that descriptor-based baselines, such as the one we propose, are still crucial for accurately assessing advancements in the GNN domain.
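The feature-engineering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only one of the three named edge descriptors (SCAN Structural Similarity, computed here with closed neighbourhoods), and the function names, the bin count of 5, and the choice to sum one-hot atom encodings into per-element counts are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def scan_similarity(edges):
    """SCAN structural similarity for every edge of an undirected graph.

    Assumed definition: |N[u] & N[v]| / sqrt(|N[u]| * |N[v]|), where N[x]
    is the closed neighbourhood of x (x together with its neighbours).
    """
    closed = defaultdict(set)
    for u, v in edges:
        closed[u].update((u, v))
        closed[v].update((u, v))
    return [
        len(closed[u] & closed[v]) / math.sqrt(len(closed[u]) * len(closed[v]))
        for u, v in edges
    ]


def histogram(values, n_bins=5, lo=0.0, hi=1.0):
    """Aggregate a variable-length list of values into fixed-length bin counts."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in values:
        # Clamp so that x == hi falls into the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def one_hot_counts(labels, vocabulary):
    """Sum of one-hot vectors over all atoms: one count per vocabulary entry."""
    return [sum(1 for x in labels if x == item) for item in vocabulary]


# Heavy-atom graph of ethanol (C-C-O), with hypothetical node indices 0, 1, 2.
bonds = [(0, 1), (1, 2)]
atomic_numbers = [6, 6, 8]

features = histogram(scan_similarity(bonds)) + one_hot_counts(atomic_numbers, [6, 7, 8])
print(features)  # [0, 0, 0, 0, 2, 2, 0, 1]
```

Every molecule maps to a vector of the same length regardless of its size, which is what lets a standard classifier such as a Random Forest consume the features directly.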

Code Implementations: 1 repo
