MCGM: Multi-stage Clustered Global Modeling for Long-range Interactions in Molecules
MCGM addresses a key bottleneck in molecular property prediction for computational chemistry, offering a general way to improve both accuracy and efficiency.
The paper tackles long-range interactions in molecules, which geometric graph neural networks model poorly because of their locality bias. It introduces MCGM, a plug-and-play module that reduces OE62 energy prediction error by an average of 26.2% and achieves state-of-the-art accuracy on AQM, with 17.0 meV for energy and 4.9 meV/Å for forces.
Geometric graph neural networks (GNNs) excel at capturing molecular geometry, yet their locality-biased message passing hampers the modeling of long-range interactions. Current solutions have fundamental limitations: extending cutoff radii causes computational costs to scale cubically with distance; physics-inspired kernels (e.g., Coulomb, dispersion) are often system-specific and lack generality; and Fourier-space methods require careful tuning of multiple parameters (e.g., mesh size, k-space cutoff) and add computational overhead. We introduce Multi-stage Clustered Global Modeling (MCGM), a lightweight, plug-and-play module that endows geometric GNNs with hierarchical global context through efficient clustering operations. MCGM builds a multi-resolution hierarchy of atomic clusters, distills global information via dynamic hierarchical clustering, and propagates this context back through learned transformations, ultimately reinforcing atomic features via residual connections. Seamlessly integrated into four diverse backbone architectures, MCGM reduces OE62 energy prediction error by an average of 26.2%. On AQM, MCGM achieves state-of-the-art accuracy (17.0 meV for energy, 4.9 meV/Å for forces) while using 20% fewer parameters than Neural P3M. Code will be made available upon acceptance.
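The pipeline described in the abstract (cluster, pool, propagate back, add residually) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, which has not been released. Several pieces are assumptions made only for illustration: plain k-means on coordinates stands in for the paper's dynamic hierarchical clustering, mean pooling stands in for the distillation step, random weight matrices stand in for the learned transformations, and the names `kmeans_assign`, `pool_mean`, `clustered_global_context`, and `cluster_sizes` are hypothetical.

```python
import numpy as np


def kmeans_assign(points, k, n_iter=10, seed=0):
    """Hard-assign each point to one of k centroids (plain k-means, an assumed stand-in)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=-1)
        assign = dists.argmin(axis=1)
        for c in range(k):
            members = points[assign == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    return assign


def pool_mean(values, assign, k):
    """Average the rows of `values` within each cluster; empty clusters stay zero."""
    sums = np.zeros((k, values.shape[1]))
    np.add.at(sums, assign, values)  # unbuffered scatter-add per cluster
    counts = np.bincount(assign, minlength=k).astype(float)
    return sums / np.maximum(counts, 1.0)[:, None]


def clustered_global_context(pos, feats, cluster_sizes=(8, 2), seed=0):
    """Add coarse, multi-stage cluster context to per-atom features.

    pos:   (n_atoms, 3) coordinates
    feats: (n_atoms, dim) atomic features from some backbone (assumed given)
    """
    rng = np.random.default_rng(seed)
    dim = feats.shape[1]
    cur_pos, cur_feats = pos, feats
    assignments = []

    # Stage 1: build the hierarchy by repeatedly clustering and pooling.
    for k in cluster_sizes:
        assign = kmeans_assign(cur_pos, k, seed=seed)
        cur_feats = pool_mean(cur_feats, assign, k)
        cur_pos = pool_mean(cur_pos, assign, k)
        assignments.append(assign)

    # Stage 2: push the coarsest context back down, one level at a time.
    context = cur_feats
    for assign in reversed(assignments):
        # Random weights are placeholders for learned parameters.
        weight = rng.normal(scale=dim ** -0.5, size=(dim, dim))
        context = np.tanh(context @ weight)[assign]

    # Stage 3: residual connection keeps the original local features.
    return feats + context


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    pos = rng.normal(size=(20, 3))
    feats = rng.normal(size=(20, 16))
    out = clustered_global_context(pos, feats)
    print(out.shape)  # (20, 16)
```

Because the coarsest level holds only a few clusters, every atom receives information pooled from the whole molecule at a cost that grows with the number of atoms and clusters rather than with the cube of a cutoff radius. In a real model the clustering, pooling, and transformations would all be trainable and inserted between backbone layers.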