Mining, analyzing, and integrating viral signals from metagenomic data

Tingting Zheng, Jun Li, Yueqiong Ni, Kang Kang, Maria-Anna Misiakou, Lejla Imamovic, Billy K C Chow, Anne A Rode, Peter Bytzer, Morten Otto Alexander Sommer*, Gianni Panagiotou

*Corresponding author for this work

Research output: Contribution to journal > Journal article > Research > peer-review

Abstract

Viruses are important components of microbial communities, modulating community structure and function; however, only a few tools are currently available for phage identification and analysis from metagenomic sequencing data. Here we employed the random forest algorithm to develop VirMiner, a web-based phage contig prediction tool that is especially sensitive for high-abundance phage contigs and was trained and validated on paired metagenomic and phagenomic sequencing data from the human gut flora. VirMiner achieved 41.06% ± 17.51% sensitivity and 81.91% ± 4.04% specificity in the prediction of phage contigs. In particular, for high-abundance phage contigs, VirMiner outperformed other tools, with much higher sensitivity (65.23% ± 16.94%) than VirFinder (34.63% ± 17.96%) and VirSorter (18.75% ± 15.23%) at almost the same specificity. Moreover, VirMiner provides the most comprehensive phage analysis pipeline, comprising metagenomic raw read processing, functional annotation, phage contig identification, and phage-host relationship prediction (CRISPR-spacer recognition), and supports two-group comparison when the input (metagenomic sequence data) includes different conditions (e.g., case and control). Application of VirMiner to an independent cohort of human gut metagenomes obtained from individuals treated with antibiotics revealed that 122 KEGG orthology groups and 118 Pfam groups differed significantly in abundance between pre-treatment samples and samples taken at the end of antibiotic administration, including groups related to clustered regularly interspaced short palindromic repeats (CRISPR), multidrug resistance, and protein transport. The VirMiner webserver is available at http://sbb.hku.hk/VirMiner/.
We developed a comprehensive tool for phage prediction and analysis of metagenomic samples. Compared to VirSorter and VirFinder, the most widely used tools, VirMiner is able to capture more high-abundance phage contigs, which could play key roles in infecting bacteria and modulating microbial community dynamics. Trial registration: European Union Clinical Trials Register, EudraCT Number 2013-003378-28, registered on 9 April 2014.
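
The sensitivity and specificity figures quoted in the abstract follow the usual confusion-matrix definitions: sensitivity is the share of true phage contigs that are predicted as phage, and specificity is the share of non-phage contigs that are predicted as non-phage. The short Python sketch below only illustrates those two definitions. The counts are hypothetical and are not taken from the paper, and the function names are illustrative rather than part of VirMiner.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of true phage contigs that were predicted as phage: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of non-phage contigs that were predicted as non-phage: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts for illustration only (NOT data from the paper).
tp, fn = 40, 60    # true phage contigs: correctly predicted / missed
tn, fp = 820, 180  # non-phage contigs: correctly rejected / wrongly called phage

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
# prints: sensitivity = 40.00%, specificity = 82.00%
```

A tool can raise one measure at the expense of the other, which is why the abstract compares sensitivities "at almost the same specificity".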
Original language: English
Article number: 42
Journal: Microbiome
Volume: 7
Issue number: 1
Number of pages: 15
ISSN: 2049-2618
DOI: https://doi.org/10.1186/s40168-019-0657-y
Publication status: Published - 2019

Keywords

  • Antibiotics
  • Metagenome
  • Phage
  • Phage-host interaction

Cite this

Zheng, Tingting ; Li, Jun ; Ni, Yueqiong ; Kang, Kang ; Misiakou, Maria-Anna ; Imamovic, Lejla ; Chow, Billy K C ; Rode, Anne A ; Bytzer, Peter ; Sommer, Morten Otto Alexander ; Panagiotou, Gianni. / Mining, analyzing, and integrating viral signals from metagenomic data. In: Microbiome. 2019 ; Vol. 7, No. 1.
@article{60b5441257864b19a00ed77a10685c33,
title = "Mining, analyzing, and integrating viral signals from metagenomic data",
abstract = "Viruses are important components of microbial communities, modulating community structure and function; however, only a few tools are currently available for phage identification and analysis from metagenomic sequencing data. Here we employed the random forest algorithm to develop VirMiner, a web-based phage contig prediction tool that is especially sensitive for high-abundance phage contigs and was trained and validated on paired metagenomic and phagenomic sequencing data from the human gut flora. VirMiner achieved 41.06{\%} ± 17.51{\%} sensitivity and 81.91{\%} ± 4.04{\%} specificity in the prediction of phage contigs. In particular, for high-abundance phage contigs, VirMiner outperformed other tools, with much higher sensitivity (65.23{\%} ± 16.94{\%}) than VirFinder (34.63{\%} ± 17.96{\%}) and VirSorter (18.75{\%} ± 15.23{\%}) at almost the same specificity. Moreover, VirMiner provides the most comprehensive phage analysis pipeline, comprising metagenomic raw read processing, functional annotation, phage contig identification, and phage-host relationship prediction (CRISPR-spacer recognition), and supports two-group comparison when the input (metagenomic sequence data) includes different conditions (e.g., case and control). Application of VirMiner to an independent cohort of human gut metagenomes obtained from individuals treated with antibiotics revealed that 122 KEGG orthology groups and 118 Pfam groups differed significantly in abundance between pre-treatment samples and samples taken at the end of antibiotic administration, including groups related to clustered regularly interspaced short palindromic repeats (CRISPR), multidrug resistance, and protein transport. The VirMiner webserver is available at http://sbb.hku.hk/VirMiner/. We developed a comprehensive tool for phage prediction and analysis of metagenomic samples. Compared to VirSorter and VirFinder, the most widely used tools, VirMiner is able to capture more high-abundance phage contigs, which could play key roles in infecting bacteria and modulating microbial community dynamics. Trial registration: European Union Clinical Trials Register, EudraCT Number 2013-003378-28, registered on 9 April 2014.",
keywords = "Antibiotics, Metagenome, Phage, Phage-host interaction",
author = "Tingting Zheng and Jun Li and Yueqiong Ni and Kang Kang and Maria-Anna Misiakou and Lejla Imamovic and Chow, {Billy K C} and Rode, {Anne A} and Peter Bytzer and Sommer, {Morten Otto Alexander} and Gianni Panagiotou",
year = "2019",
doi = "10.1186/s40168-019-0657-y",
language = "English",
volume = "7",
journal = "Microbiome",
issn = "2049-2618",
publisher = "BioMed Central Ltd.",
number = "1",
pages = "42",
}

Zheng, T, Li, J, Ni, Y, Kang, K, Misiakou, M-A, Imamovic, L, Chow, BKC, Rode, AA, Bytzer, P, Sommer, MOA & Panagiotou, G 2019, 'Mining, analyzing, and integrating viral signals from metagenomic data', Microbiome, vol. 7, no. 1, 42. https://doi.org/10.1186/s40168-019-0657-y

Mining, analyzing, and integrating viral signals from metagenomic data. / Zheng, Tingting; Li, Jun; Ni, Yueqiong; Kang, Kang; Misiakou, Maria-Anna; Imamovic, Lejla; Chow, Billy K C; Rode, Anne A; Bytzer, Peter; Sommer, Morten Otto Alexander; Panagiotou, Gianni.

In: Microbiome, Vol. 7, No. 1, 42, 2019.

TY - JOUR

T1 - Mining, analyzing, and integrating viral signals from metagenomic data

AU - Zheng, Tingting

AU - Li, Jun

AU - Ni, Yueqiong

AU - Kang, Kang

AU - Misiakou, Maria-Anna

AU - Imamovic, Lejla

AU - Chow, Billy K C

AU - Rode, Anne A

AU - Bytzer, Peter

AU - Sommer, Morten Otto Alexander

AU - Panagiotou, Gianni

PY - 2019

Y1 - 2019

AB - Viruses are important components of microbial communities, modulating community structure and function; however, only a few tools are currently available for phage identification and analysis from metagenomic sequencing data. Here we employed the random forest algorithm to develop VirMiner, a web-based phage contig prediction tool that is especially sensitive for high-abundance phage contigs and was trained and validated on paired metagenomic and phagenomic sequencing data from the human gut flora. VirMiner achieved 41.06% ± 17.51% sensitivity and 81.91% ± 4.04% specificity in the prediction of phage contigs. In particular, for high-abundance phage contigs, VirMiner outperformed other tools, with much higher sensitivity (65.23% ± 16.94%) than VirFinder (34.63% ± 17.96%) and VirSorter (18.75% ± 15.23%) at almost the same specificity. Moreover, VirMiner provides the most comprehensive phage analysis pipeline, comprising metagenomic raw read processing, functional annotation, phage contig identification, and phage-host relationship prediction (CRISPR-spacer recognition), and supports two-group comparison when the input (metagenomic sequence data) includes different conditions (e.g., case and control). Application of VirMiner to an independent cohort of human gut metagenomes obtained from individuals treated with antibiotics revealed that 122 KEGG orthology groups and 118 Pfam groups differed significantly in abundance between pre-treatment samples and samples taken at the end of antibiotic administration, including groups related to clustered regularly interspaced short palindromic repeats (CRISPR), multidrug resistance, and protein transport. The VirMiner webserver is available at http://sbb.hku.hk/VirMiner/. We developed a comprehensive tool for phage prediction and analysis of metagenomic samples. Compared to VirSorter and VirFinder, the most widely used tools, VirMiner is able to capture more high-abundance phage contigs, which could play key roles in infecting bacteria and modulating microbial community dynamics. Trial registration: European Union Clinical Trials Register, EudraCT Number 2013-003378-28, registered on 9 April 2014.

KW - Antibiotics

KW - Metagenome

KW - Phage

KW - Phage-host interaction

U2 - 10.1186/s40168-019-0657-y

DO - 10.1186/s40168-019-0657-y

M3 - Journal article

VL - 7

JO - Microbiome

JF - Microbiome

SN - 2049-2618

IS - 1

M1 - 42

ER -