MetaPhinder: Identifying Bacteriophage Sequences in Metagenomic Data Sets

Vanessa Isabell Jurtz, Julia Villarroel, Ole Lund, Mette Voldby Larsen, Morten Nielsen

    Research output: Contribution to journal › Journal article › Research › peer-review

    Abstract

    Bacteriophages are the most abundant biological entities on the planet, yet because of their small genome sizes they account for little of the genetic material isolated from most environments. They also show great genetic diversity and mosaic genomes, which makes them challenging to analyze and understand. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e., contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole-genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to outperform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology, https://cge.cbs.dtu.dk/services/MetaPhinder/, and the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder.
    Original language: English
    Article number: e0163111
    Journal: PLoS ONE
    Volume: 11
    Issue number: 9
    Number of pages: 14
    ISSN: 1932-6203
    Publication status: Published - 2016

    Bibliographical note

    This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
