TY - JOUR
T1 - euka: Robust tetrapodic and arthropodic taxa detection from modern and ancient environmental DNA using pangenomic reference graphs
AU - Vogel, Nicola Alexandra
AU - Rubin, Joshua Daniel
AU - Swartz, Mikkel
AU - Vlieghe, Juliette
AU - Sackett, Peter Wad
AU - Pedersen, Anders Gorm
AU - Pedersen, Mikkel Winther
AU - Renaud, Gabriel
N1 - Publisher Copyright: © 2023 The Authors. Methods in Ecology and Evolution published by John Wiley & Sons Ltd on behalf of British Ecological Society.
PY - 2023
Y1 - 2023
AB - 1. Ancient environmental DNA (aeDNA) is a crucial source of information for past environmental reconstruction. However, the computational analysis of aeDNA involves the inherited challenges of ancient DNA (aDNA) and the typical difficulties of eDNA samples, such as taxonomic identification and abundance estimation of identified taxonomic groups. Current methods for aeDNA fall into those that only perform mapping followed by taxonomic identification and those that purport to do abundance estimation. The former leaves abundance estimates to users, while methods for the latter are not designed for large metagenomic datasets and are often imprecise and challenging to use. 2. Here, we introduce euka, a tool designed for rapid and accurate characterisation of aeDNA samples. We use a taxonomy-based pangenome graph of reference genomes for robustly assigning DNA sequences and use a maximum-likelihood framework for abundance estimation. At the present time, our database is restricted to mitochondrial genomes of tetrapods and arthropods but can be expanded in future versions. 3. We find euka to outperform current taxonomic profiling tools and their abundance estimates. Crucially, we show that regardless of the filtering threshold set by existing methods, euka demonstrates higher accuracy. Furthermore, our approach is robust to sparse data, which is idiosyncratic of aeDNA, detecting a taxon with an average of 50 reads aligning. We also show that euka is consistent with competing tools on empirical samples. 4. euka's features are fine-tuned to deal with the challenges of aeDNA, making it a simple-to-use, all-in-one tool. It is available on GitHub: https://github.com/grenaud/vgan. euka enables researchers to quickly assess and characterise their sample, thus allowing it to be used as a routine screening tool for aeDNA.
KW - Ancient environmental DNA
KW - Bayesian
KW - Bioinformatics
KW - Paleoecology
KW - Pangenomics
KW - Software
U2 - 10.1111/2041-210X.14214
DO - 10.1111/2041-210X.14214
M3 - Journal article
AN - SCOPUS:85171897073
SN - 2041-210X
VL - 14
SP - 2717
EP - 2727
JO - Methods in Ecology and Evolution
JF - Methods in Ecology and Evolution
IS - 11
ER -