Analysis of 62 hybrid assembled human Y chromosomes exposes rapid structural changes and high rates of gene conversion

Jose Maria Gonzalez-Izarzugaza, Laurits Skov, Lasse Maretty, Jacob Malte Jensen, Bent Petersen, Jonas Andreas Sibbesen, Siyang Liu, Palle Villesen, Kirstine González-Izarzugaza Belling, Christian Theil Have, Jose Maria Gonzalez-Izarzugaza, Marie Grosjean, Jette Bork-Jensen, Jakob Grove, Thomas D. Als, Shujia Huang, Yuqi Chang, Ruiqi Xu, Weijian Ye, Junhua Rao, Xiaosen Guo, Jihua Sun, Hongzhi Cao, Chen Ye, Johan van Beusekom, Thomas Espeseth, Esben Flindt, Rune M. Friborg, Anders E. Halager, Stephanie Le Hellard, Christina M. Hultman, Francesco Lescai, Shengting Li, Ole Lund, Peter Løngren, Thomas Mailund, María Luisa Matey-Hernandez, Ole Mors, Christian N. S. Pedersen, Thomas Sicheritz-Pontén, Patrick F. Sullivan, Syed Qaswar Ali Shah, David Westergaard, Rachita Yadav, Ning Li, Xun Xu, Torben Hansen, Anders Krogh, Lars Bolund, Thorkild I. A. Sørensen, Oluf Pedersen, Ramneek Gupta, Simon Rasmussen, Søren Besenbacher, Anders D. Børglum, Jun Wang, Hans Eiberg, Karsten Kristiansen, Søren Brunak, Mikkel Heide Schierup

    Research output: Contribution to journal > Journal article > Research > peer-review

    Abstract

    The human Y chromosome does not recombine across its male-specific part and is therefore an excellent marker of human migrations. It also plays an important role in male fertility. However, its evolution is difficult to fully understand because of repetitive sequences, inverted repeats and the potentially large role of gene conversion. Here we perform an evolutionary analysis of 62 Y chromosomes of Danish descent sequenced using a wide range of library insert sizes and high coverage, thus allowing large regions of these chromosomes to be well assembled. These include 17 father-son pairs, which we use to validate variation calling. Using a recent method that can integrate variants based on both mapping and de novo assembly, we genotype 10898 SNVs and 2903 indels (max length of 27241 bp) in our sample and show by father-son concordance and experimental validation that the non-recurrent SNV and indel variation on the Y chromosome tree is called very accurately. This includes variation called in a 0.9 Mb centromeric heterochromatic region, which is by far the most variable region of the Y chromosome. The variation also includes longer sequence stretches not present in the reference genome but shared with the chimpanzee Y chromosome. We analyzed 2.7 Mb of large inverted repeats (palindromes) for variation patterns between the two palindrome arms and identified 603 mutations and 416 gene conversion events. We find clear evidence for GC-biased gene conversion in the palindromes (and a balancing AT mutation bias), but irrespective of this, also a strong bias of gene conversion towards the ancestral state, suggesting that palindromic gene conversion may alleviate Muller’s ratchet. Finally, we also find a large number of large-scale gene duplications and deletions in the palindromic regions (at least 24) and find that such events can consist of complex combinations of simultaneous insertions and deletions of long stretches of the Y chromosome.
    Original language: English
    Article number: e1006834
    Journal: P L o S Genetics
    Volume: 13
    Issue number: 8
    Number of pages: 20
    ISSN: 1553-7390
    DOI: https://doi.org/10.1371/journal.pgen.1006834
    Publication status: Published - 2017

    Cite this

    Gonzalez-Izarzugaza, J. M., Skov, L., Maretty, L., Jensen, J. M., Petersen, B., Andreas Sibbesen, J., ... Schierup, M. H. (2017). Analysis of 62 hybrid assembled human Y chromosomes exposes rapid structural changes and high rates of gene conversion. P L o S Genetics, 13(8), [e1006834]. https://doi.org/10.1371/journal.pgen.1006834
    @article{9093eec59cc84d5182b2e49aac0215b9,
    title = "Analysis of 62 hybrid assembled human Y chromosomes exposes rapid structural changes and high rates of gene conversion",
    author = "Gonzalez-Izarzugaza, {Jose Maria} and Laurits Skov and Lasse Maretty and Jensen, {Jacob Malte} and Bent Petersen and {Andreas Sibbesen}, Jonas and Siyang Liu and Palle Villesen and Belling, {Kirstine Gonz{\'a}lez-Izarzugaza} and {Theil Have}, Christian and Gonzalez-Izarzugaza, {Jose Maria} and Marie Grosjean and Jette Bork-Jensen and Jakob Grove and Als, {Thomas D.} and Shujia Huang and Yuqi Chang and Ruiqi Xu and Weijian Ye and Junhua Rao and Xiaosen Guo and Jihua Sun and Hongzhi Cao and Chen Ye and {van Beusekom}, Johan and Thomas Espeseth and Esben Flindt and Friborg, {Rune M.} and Halager, {Anders E.} and {Le Hellard}, Stephanie and Hultman, {Christina M.} and Francesco Lescai and Shengting Li and Ole Lund and Peter L{\o}ngren and Thomas Mailund and Matey-Hernandez, {Mar{\'i}a Luisa} and Ole Mors and Pedersen, {Christian N. S.} and Thomas Sicheritz-Pont{\'e}n and Sullivan, {Patrick F.} and {Qaswar Ali Shah}, Syed and David Westergaard and Rachita Yadav and Ning Li and Xun Xu and Torben Hansen and Anders Krogh and Lars Bolund and S{\o}rensen, {Thorkild I. A.} and Oluf Pedersen and Ramneek Gupta and Simon Rasmussen and S{\o}ren Besenbacher and B{\o}rglum, {Anders D.} and Jun Wang and Hans Eiberg and Karsten Kristiansen and S{\o}ren Brunak and Schierup, {Mikkel Heide}",
    year = "2017",
    doi = "10.1371/journal.pgen.1006834",
    language = "English",
    volume = "13",
    journal = "P L o S Genetics",
    issn = "1553-7390",
    publisher = "Public Library of Science",
    number = "8",

    }

    TY - JOUR

    T1 - Analysis of 62 hybrid assembled human Y chromosomes exposes rapid structural changes and high rates of gene conversion

    AU - Gonzalez-Izarzugaza, Jose Maria

    AU - Skov, Laurits

    AU - Maretty, Lasse

    AU - Jensen, Jacob Malte

    AU - Petersen, Bent

    AU - Andreas Sibbesen, Jonas

    AU - Liu, Siyang

    AU - Villesen, Palle

    AU - Belling, Kirstine González-Izarzugaza

    AU - Theil Have, Christian

    AU - Gonzalez-Izarzugaza, Jose Maria

    AU - Grosjean, Marie

    AU - Bork-Jensen, Jette

    AU - Grove, Jakob

    AU - Als, Thomas D.

    AU - Huang, Shujia

    AU - Chang, Yuqi

    AU - Xu, Ruiqi

    AU - Ye, Weijian

    AU - Rao, Junhua

    AU - Guo, Xiaosen

    AU - Sun, Jihua

    AU - Cao, Hongzhi

    AU - Ye, Chen

    AU - van Beusekom, Johan

    AU - Espeseth, Thomas

    AU - Flindt, Esben

    AU - Friborg, Rune M.

    AU - Halager, Anders E.

    AU - Le Hellard, Stephanie

    AU - Hultman, Christina M.

    AU - Lescai, Francesco

    AU - Li, Shengting

    AU - Lund, Ole

    AU - Løngren, Peter

    AU - Mailund, Thomas

    AU - Matey-Hernandez, María Luisa

    AU - Mors, Ole

    AU - Pedersen, Christian N. S.

    AU - Sicheritz-Pontén, Thomas

    AU - Sullivan, Patrick F.

    AU - Qaswar Ali Shah, Syed

    AU - Westergaard, David

    AU - Yadav, Rachita

    AU - Li, Ning

    AU - Xu, Xun

    AU - Hansen, Torben

    AU - Krogh, Anders

    AU - Bolund, Lars

    AU - Sørensen, Thorkild I. A.

    AU - Pedersen, Oluf

    AU - Gupta, Ramneek

    AU - Rasmussen, Simon

    AU - Besenbacher, Søren

    AU - Børglum, Anders D.

    AU - Wang, Jun

    AU - Eiberg, Hans

    AU - Kristiansen, Karsten

    AU - Brunak, Søren

    AU - Schierup, Mikkel Heide

    PY - 2017

    Y1 - 2017

    U2 - 10.1371/journal.pgen.1006834

    DO - 10.1371/journal.pgen.1006834

    M3 - Journal article

    C2 - 28846694

    VL - 13

    JO - P L o S Genetics

    JF - P L o S Genetics

    SN - 1553-7390

    IS - 8

    M1 - e1006834

    ER -