Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale

Siyang Liu, Shujia Huang, Junhua Rao, Weijian Ye, Mikkel H. Schierup, Palle Villesen, Xun Xu, Ning Li, Karsten Kristiansen, Thorkild I. A. Sørensen, Torben Hansen, Oluf Pedersen, Søren Brunak, Ramneek Gupta, Simon Rasmussen, Ole Lund, Lars Bolund, Anders D. Børglum, Hans Eiberg, Esben Nørgaard Flindt, Ruiqi Xu, Jihua Sun, Hao Liu, Hui Jiang, Ou Wang, Xiaofang Cheng, Ditte Demontis, Søren Besenbacher, Thomas Mailund, Rune M. Friborg, Christian N. S. Pedersen, Siyang Liu, Yuqi Chang, Shengting Li, Xiaosen Guo, Hongzhi Cao, Chen Ye, Lasse Maretty, Jonas Andreas Sibbesen, Anders Albrechtsen, Jette Bork-Jensen, Christian Theil Have, Jose Maria Gonzalez-Izarzugaza, Kirstine González-Izarzugaza Belling, Rachita Yadav, Jakob Grove, Thomas Dam-Als, Francesco Lescai, Anders Krogh, Jun Wang

    Research output: Contribution to journal › Journal article › Research › peer-review


    Abstract

    Comprehensive recognition of genomic variation in an individual is important for understanding disease and for developing personalized medication and treatment. Many tools based on DNA re-sequencing exist for identifying single nucleotide polymorphisms, small insertions and deletions (indels), and large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes, and they do not provide interpretive information such as annotation of the ancestral state and formation mechanism. We present a novel approach, implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies at up to single-nucleotide resolution. Application of AsmVar to several human de novo genome assemblies captures a wide spectrum of the structural variants and novel sequences present in the human population with high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction of population-scale pan-genomes. Our study also highlights the usefulness of the de novo assembly strategy for defining genome structure.
    Original language: English
    Journal: GigaScience
    Volume: 4
    Issue number: 64
    Number of pages: 13
    ISSN: 2047-217X
    Publication status: Published - 2015

    Bibliographical note

    © 2015 Liu et al. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

    Keywords

    • Novel sequence
    • Structural variation
    • De novo assembly
