Disentangling the Complex Multiplexed DIA Spectra in De Novo Peptide Sequencing
This work addresses challenges that proteomics researchers face when using mass spectrometry, but it is incremental: it builds on existing DIA methods with specific improvements.
The paper tackled the problem of using Data-Independent Acquisition (DIA) mass spectrometry data for de novo peptide sequencing, which is hindered by coeluted peptides and noise. It found that DIANovo, a new deep learning method, improves on previous systems by a large margin, and that DIA outperforms Data-Dependent Acquisition (DDA) on Orbitrap Astral instruments because of their narrow window mode.
Data-Independent Acquisition (DIA) was introduced to improve sensitivity by covering all peptides in a range, rather than sampling only high-intensity peaks as in Data-Dependent Acquisition (DDA) mass spectrometry. However, it is unclear how useful DIA data are for de novo peptide sequencing, as DIA data are marred by coeluted peptides, high noise, and varying data quality. We present a new deep learning method, DIANovo, that addresses each of these difficulties and improves on previously established systems by a large margin by equipping the model with a deeper understanding of coeluted DIA spectra. This paper also provides criteria for when DIA data should and should not be used for de novo peptide sequencing, based on a comparison between DDA and DIA in both de novo and database search modes. We find that while DIA excels with narrow isolation windows on older-generation instruments, it loses its advantage with wider windows. With the Orbitrap Astral, however, DIA consistently outperforms DDA because narrow window mode is enabled. We also provide a theoretical explanation of this phenomenon, emphasizing the critical role of the signal-to-noise profile in the successful application of de novo sequencing.