Abstract
Database search remains the primary strategy for peptide detection in mass spectrometry-based proteomics, but growing data sets and increasingly expansive peptide search spaces now challenge its computational limits. At the same time, machine learning has transformed multiple aspects of spectrum identification and is increasingly applied directly to peptide-spectrum matching. Neural network models have been proposed as core engines for database search, yet the computational complexities of such approaches have not been systematically defined or compared. Here, we present a range of emerging approaches for database search and a theoretical framework for runtime and scaling in spectrum identification, contrasting classical search strategies with emerging neural network-based methods. We analyze asymptotic complexity in the number of spectra and peptide candidates and estimate practical wall time and memory requirements under realistic hardware assumptions. Our framework highlights trade-offs and provides a guide for selecting and developing scalable peptide search strategies in the era of large models and proteomics data sets. We therefore consider whether learned scoring models may progressively replace or augment classical similarity functions at the peptide-spectrum scoring level.
| Original language | English |
|---|---|
| Journal | Journal of Proteome Research |
| Number of pages | 9 |
| ISSN | 1535-3893 |
| Publication status | Accepted/In press - 2026 |
Keywords
- AI models
- Database search
- De novo peptide sequencing
- Hybrid proteomics searches
- Machine learning
- Mass spectrometry
- Peptide spectrum match
- Proteomics
- Reference database
Fingerprint
Dive into the research topics of 'A Framework for Database Search with AI Models in Mass Spectrometry-Based Proteomics'. Together they form a unique fingerprint.