Abstract
Motivation: Infectious diseases remain a leading cause of mortality and a significant global health threat. Tools for surveillance and early detection of emerging pathogens are therefore needed.
Results: We introduce PathogenFinder2, a novel, alignment-free, taxonomy-agnostic model for predicting bacterial pathogenic capacity in humans using protein language models. It outperforms previous methods, particularly for novel taxa, and provides interpretable outputs by highlighting proteins most relevant to pathogenic potential. These insights aid the identification of virulence factors, vaccine targets, and infection-related metabolic pathways. Furthermore, we introduce the Bacterial Pathogenic Capacity Landscape, which reveals patterns linked to host condition, infection site, microbial antagonism, and environmental origin.
Availability: The model is freely available online at https://genepi.dk/pathogenfinder2, or as a standalone program (https://github.com/genomicepidemiology/PathogenFinder2).
Supplementary information: Supplementary data are available at Bioinformatics online.
| Original language | English |
|---|---|
| Article number | btag129 |
| Journal | Bioinformatics |
| ISSN | 1367-4803 |
| DOIs | |
| Publication status | Accepted/In press - 2026 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Fingerprint
Dive into the research topics of 'Whole-genome prediction of bacterial pathogenic capacity on novel bacteria using protein language models with PathogenFinder2'. Together they form a unique fingerprint.