NMIRacle: Multi-modal Generative Molecular Elucidation from IR and NMR Spectra
This work addresses molecular structure elucidation, a task that traditionally requires expert chemists, and represents an incremental advance in AI-driven spectroscopy.
The paper tackles molecular structure elucidation from spectroscopic data by introducing NMIRacle, a two-stage generative framework that bridges fragment-level chemical modeling with spectral evidence and outperforms existing baselines on molecular elucidation.
Molecular structure elucidation from spectroscopic data is a long-standing challenge in chemistry, traditionally requiring expert interpretation. We introduce NMIRacle, a two-stage generative framework that builds upon recent paradigms in AI-driven spectroscopy with minimal assumptions. In the first stage, NMIRacle learns to reconstruct molecular structures from count-aware fragment encodings, which capture both fragment identities and their occurrences. In the second stage, a spectral encoder maps input spectroscopic measurements (IR, 1H-NMR, 13C-NMR) into a latent embedding that conditions the pre-trained generator. This formulation bridges fragment-level chemical modeling with spectral evidence, yielding accurate molecular predictions. Empirical results show that NMIRacle outperforms existing baselines on molecular elucidation while maintaining robust performance across increasing levels of molecular complexity.
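To illustrate what "count-aware" adds over a plain presence/absence fingerprint, the following is a minimal sketch, not the paper's implementation. The fragment vocabulary `VOCAB`, the helper names, and the two example molecules are hypothetical and chosen only for illustration; how NMIRacle actually defines or extracts fragments is not specified in the text above.

```python
from collections import Counter

# Hypothetical fragment vocabulary (SMILES-like strings), assumed for illustration.
VOCAB = ["c1ccccc1", "C(=O)O", "CO", "CN", "C"]


def count_encoding(fragments, vocab=VOCAB):
    """Return one integer per vocabulary entry: how often that fragment occurs.

    Fragments that are not in the vocabulary are ignored in this sketch.
    """
    counts = Counter(fragments)
    return [counts[frag] for frag in vocab]


def presence_encoding(fragments, vocab=VOCAB):
    """Return one 0/1 flag per vocabulary entry: whether the fragment occurs at all."""
    present = set(fragments)
    return [int(frag in present) for frag in vocab]


# Two hypothetical molecules with the same fragment identities but different counts.
mol_a = ["c1ccccc1", "C(=O)O"]
mol_b = ["c1ccccc1", "C(=O)O", "C(=O)O"]

print(presence_encoding(mol_a), presence_encoding(mol_b))
print(count_encoding(mol_a), count_encoding(mol_b))
```

Under these assumptions the presence encoding cannot tell the two molecules apart, while the count encoding can, which is the kind of information a generator conditioned on fragment occurrences could exploit.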