LG · IR · QM · Mar 7

MS2MetGAN: Latent-space adversarial training for metabolite-spectrum matching in MS/MS database search

arXiv:2603.13342 · h-index: 1
AI Analysis

This work addresses metabolite identification for researchers in metabolomics, representing an incremental improvement through enhanced negative sample generation.

The paper aims to improve metabolite identification accuracy in MS/MS database search by proposing a new framework for generating negative training samples. The authors report that the resulting tool, MS2MetGAN, achieves better overall performance than existing methods.

Database search is a widely used approach for identifying metabolites from tandem mass spectra (MS/MS). In this strategy, an experimental spectrum is matched against a user-specified database of candidate metabolites, and candidates are ranked such that true metabolite-spectrum matches receive the highest scores. Machine-learning methods have been widely incorporated into database-search-based identification tools and have substantially improved performance. To further improve identification accuracy, we propose a new framework for generating negative training samples. The framework first uses autoencoders to learn latent representations of metabolite structures and MS/MS spectra, thereby recasting metabolite-spectrum matching as matching between latent vectors. It then uses a GAN to generate latent vectors of decoy metabolites and constructs decoy metabolite-spectrum matches as negative samples for training. Experimental results show that our tool, MS2MetGAN, achieves better overall performance than existing metabolite identification methods.
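The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions. The latent vectors are hard-coded rather than produced by trained autoencoders. `placeholder_generator` is a hypothetical stand-in that returns Gaussian noise where a trained GAN generator would be. Cosine similarity is swapped in for the learned matching model. All function and variable names are illustrative. The sketch shows only how true pairs and decoy pairs might be labeled as positive and negative samples, and how candidates are ranked against a query spectrum.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed scorer)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def placeholder_generator(rng, dim):
    """Hypothetical stand-in for a trained GAN generator.

    A real generator would be a learned network mapping noise to a decoy
    metabolite latent vector; here we simply return Gaussian noise.
    """
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def build_training_pairs(true_pairs, rng, decoys_per_spectrum=2):
    """Label true (metabolite, spectrum) latent pairs 1 and decoy pairs 0."""
    samples = []
    for met_z, spec_z in true_pairs:
        samples.append((met_z, spec_z, 1))
        for _ in range(decoys_per_spectrum):
            decoy_z = placeholder_generator(rng, len(met_z))
            samples.append((decoy_z, spec_z, 0))
    return samples


def rank_candidates(spec_z, candidates, score=cosine):
    """Return candidate names sorted from best to worst score."""
    return sorted(
        candidates,
        key=lambda name: score(candidates[name], spec_z),
        reverse=True,
    )


rng = random.Random(0)

# Toy latent vectors: (metabolite latent, spectrum latent) for known matches.
true_pairs = [
    ([1.0, 0.0, 0.5], [0.9, 0.1, 0.4]),
    ([0.0, 1.0, 0.2], [0.1, 0.8, 0.3]),
]
samples = build_training_pairs(true_pairs, rng)

# Database search: rank candidate metabolites against one query spectrum.
candidates = {
    "cand_A": [1.0, 0.0, 0.5],
    "cand_B": [0.0, 1.0, 0.2],
    "cand_C": [-1.0, 0.0, 0.0],
}
ranking = rank_candidates([0.9, 0.1, 0.4], candidates)
print(ranking)
```

In a fuller version, the labeled `samples` would train a scoring model that replaces `cosine` in `rank_candidates`, so that true metabolite-spectrum matches receive the highest scores.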

