LG | QM | Feb 25, 2025

Neural Graph Matching Improves Retrieval Augmented Generation in Molecular Machine Learning

arXiv:2502.17874v2 | 5 citations | h-index: 53 | Has Code | ICML
Originality: Incremental advance
AI Analysis

This work targets molecular machine learning, specifically mass spectrum simulation, with a hybrid approach that combines retrieval augmentation and neural graph matching. The reported gains are strong, but the contribution is incremental because it combines existing techniques.

The paper tackles the problem of integrating retrieval-augmented generation into molecular machine learning. It proposes MARASON, a model that uses neural graph matching to improve the structural alignment of retrieved molecules to a query molecule. MARASON achieves 28% top-1 accuracy, compared with 19% for the non-retrieval state of the art.

Molecular machine learning has gained popularity with the advancements of geometric deep learning. In parallel, retrieval-augmented generation has become a principled approach commonly used with language models. However, the optimal integration of retrieval augmentation into molecular machine learning remains unclear. Graph neural networks stand to benefit from clever matching to understand the structural alignment of retrieved molecules to a query molecule. Neural graph matching offers a compelling solution by explicitly modeling node and edge affinities between two structural graphs while employing a noise-robust, end-to-end neural network to learn affinity metrics. We apply this approach to mass spectrum simulation and introduce MARASON, a novel model that incorporates neural graph matching to enhance a fragmentation-based neural network. Experimental results highlight the effectiveness of our design, with MARASON achieving 28% top-1 accuracy, a substantial improvement over the non-retrieval state-of-the-art accuracy of 19%. Moreover, MARASON outperforms both naive retrieval-augmented generation methods and traditional graph matching approaches. Code is publicly available at https://github.com/coleygroup/ms-pred
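To make the idea of node affinities between two structural graphs more concrete, here is a minimal sketch of soft node matching. It is not the MARASON implementation: the feature vectors are hand-picked toy values standing in for learned GNN embeddings, the function names `node_affinity` and `sinkhorn` are hypothetical, the temperature `tau` is an arbitrary choice, and edge affinities are left out entirely. Sinkhorn normalization is used here only as one common way to turn an affinity matrix into a soft assignment.

```python
import numpy as np


def node_affinity(query_feats, ref_feats):
    """Cosine similarity between every query node and every reference node."""
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    return q @ r.T


def _logsumexp(x, axis):
    """Numerically stable log-sum-exp that keeps the reduced axis."""
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))


def sinkhorn(affinity, tau=0.1, n_iters=50):
    """Alternate row and column normalization in log space.

    For a square affinity matrix the result approaches a doubly
    stochastic matrix, i.e. a soft one-to-one node matching.
    """
    log_p = affinity / tau
    for _ in range(n_iters):
        log_p = log_p - _logsumexp(log_p, axis=1)  # rows sum to 1
        log_p = log_p - _logsumexp(log_p, axis=0)  # columns sum to 1
    return np.exp(log_p)


# Toy node features (assumption: stand-ins for learned atom embeddings).
query = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# The retrieved "reference" graph has the same nodes in a shuffled order.
ref = query[[2, 0, 1]]

soft_matching = sinkhorn(node_affinity(query, ref))
# Index of the best-matching reference node for each query node.
matches = soft_matching.argmax(axis=1)
print(np.round(soft_matching, 3))
print(matches)
```

In this toy setup the reference nodes are a permutation of the query nodes, so the soft matching concentrates on that permutation. In a learned setting, the affinities would come from a trained network rather than fixed cosine similarities.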

