Accurate continuous geographic assignment from low- to high-density SNP data

Gilles Guillot, Hákon Jónsson, Antoine Hinge, Nabil Manchih, Ludovic Orlando

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

Motivation: Large-scale genotype datasets can help track the dispersal patterns of epidemiological outbreaks and predict the geographic origins of individuals. This has direct applications in forensics, for profiling both victims and criminals, and in wildlife management, where poaching hotspots can be located. Such approaches, however, require fast and accurate geographical assignment methods.
Results: We introduce a novel statistical method for geopositioning individuals of unknown origin from genotypes. Our method is based on a geostatistical model trained with a dataset of georeferenced genotypes. Statistical inference under this model can be implemented within the theoretical framework of Integrated Nested Laplace Approximation (INLA), one of the major recent breakthroughs in statistics, which requires no Monte Carlo simulations. We compare the performance of our method and SPA in a simulation framework. We highlight the accuracy and limits of continuous spatial assignment methods at various scales by analyzing genotype datasets from a diversity of species, including the Florida scrub jay (Aphelocoma coerulescens), Arabidopsis thaliana and humans, representing 41 to 197,146 SNPs. Our method appears to be best tailored for the analysis of medium-size datasets (a few tens of thousands of loci), such as the reduced-representation sequencing data that are becoming increasingly available in ecology.
Original language: English
Journal: Bioinformatics
Volume: 32
Issue number: 7
Pages (from-to): 1106-1108
ISSN: 1367-4803
Publication status: Published - 2016
