De-novo Identification of Small Molecules from Their GC-EI-MS Spectra
This work addresses a specific challenge in analytical chemistry, the identification of unknown compounds, but it appears incremental: it builds on prior de-novo methods and extends them to a harder use case.
The paper tackles the problem of identifying unknown small molecules from GC-EI-MS spectra, a setting in which existing spectral databases are insufficient. It proposes a novel de-novo machine learning method that has to work without the additional information from MS/MS experiments on which earlier methods rely. The authors analyze the strengths and drawbacks of the method, but no concrete numerical results are provided.
Identification of experimentally acquired mass spectra of unknown compounds presents a~particular challenge because reliable spectral databases do not cover the potential chemical space with sufficient density. Therefore, machine-learning-based \emph{de-novo} methods, which derive the molecular structure directly from the mass spectrum, have gained attention recently. We present a~novel method in this family, addressing the specific use case of GC-EI-MS spectra, which is particularly hard due to the lack of additional information from the first stage of MS/MS experiments, on which the previously published methods rely. We analyze the strengths and drawbacks of our approach and discuss future directions.