InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments

Kevin Eloff*, Konstantinos Kalogeropoulos*, Amandla Mabona, Oliver Morell, Rachel Catzel, Esperanza Rivera-de-Torre, Jakob Berg Jespersen, Wesley Williams, Sam P.B. van Beljouw, Marcin J. Skwark, Andreas Hougaard Laustsen, Stan J.J. Brouns, Anne Ljungars, Erwin M. Schoof, Jeroen Van Goey, Ulrich auf dem Keller, Karim Beguir, Nicolas Lopez Carranza, Timothy P. Jenkins*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

Mass spectrometry-based proteomics focuses on identifying the peptide that generates a tandem mass spectrum. Traditional methods rely on protein databases but are often limited or inapplicable in certain contexts. De novo peptide sequencing, which assigns peptide sequences to spectra without prior information, is valuable for diverse biological applications; however, owing to a lack of accuracy, it remains challenging to apply. Here we introduce InstaNovo, a transformer model that translates fragment ion peaks into peptide sequences. We demonstrate that InstaNovo outperforms state-of-the-art methods and showcase its utility in several applications. We also introduce InstaNovo+, a diffusion model that improves performance through iterative refinement of predicted sequences. Using these models, we achieve improved therapeutic sequencing coverage, discover novel peptides and detect unreported organisms in diverse datasets, thereby expanding the scope and detection rate of proteomics searches. Our models unlock opportunities across domains such as direct protein sequencing, immunopeptidomics and exploration of the dark proteome.
Original language: English
Journal: Nature Machine Intelligence
Volume: 7
Pages (from-to): 565-579
Publication status: Published - 2025
