PanYolo: a pangenome-based deep-learning model for variant calling

Sajad Tavakoli*, Marjan Mansourvar, Rasmus John Normand Frandsen, Christopher Workman

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference abstract in proceedings > Research > peer-review

Abstract

Genome variation occurs at different scales, from small variants (SNPs and short indels) to structural variations (SVs), e.g., long indels, duplications, and translocations [1]. Among these, SVs and copy number variations (CNVs) are extremely challenging to detect, yet they strongly affect the genome and, consequently, gene function and phenotype [1]. To date, many methods and algorithms have been developed to discover and characterize genomic variation. These methods are generally based on linear alignment of sequencing reads to a reference genome, which suffers from reference bias and incurs a high computational cost [2-4]. Moreover, they cannot accurately detect variations such as long indels and CNVs in complex or repetitive regions [2-4]. To address this problem, we have designed a deep learning-based method that uses pangenome data as the reference. The method comprises five main steps. First, a de novo assembly of the input reads is prepared. Second, the assembly is aligned to the pangenome reference. Third, handcrafted features are extracted from the aligned sequences and converted to very long images (5 × 200000 pixels, covering 1M base pairs). Fourth, the images are fed to the PanYolo model (inspired by YOLO, a powerful deep learning model). Finally, a postprocessing step is applied to the output of PanYolo. The method is still under development, and we are working to improve its efficiency and accuracy.
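
The image dimensions quoted in the third step imply a simple piece of arithmetic: 1,000,000 base pairs spread over 200,000 pixel columns gives 5 base pairs per column. The sketch below illustrates one way such an encoding could work. It is not the authors' implementation. It assumes that the 5 rows are five per-base feature tracks and that each pixel is the mean of its 5 bases, neither of which the abstract states. The names `tracks_to_image` and `pixel_to_bp` are hypothetical.

```python
from typing import List, Sequence, Tuple

WINDOW_BP = 1_000_000   # bases covered by one image (from the abstract)
IMAGE_WIDTH = 200_000   # pixel columns per image (from the abstract)
IMAGE_HEIGHT = 5        # rows; ASSUMED here to be one row per feature track
BIN_SIZE = WINDOW_BP // IMAGE_WIDTH  # 5 bases per pixel column


def tracks_to_image(tracks: Sequence[Sequence[float]], width: int) -> List[List[float]]:
    """Average-pool each per-base feature track into `width` pixels.

    Mean pooling is an assumption made for illustration only.
    """
    if not tracks:
        raise ValueError("at least one feature track is required")
    n_bases = len(tracks[0])
    if any(len(t) != n_bases for t in tracks):
        raise ValueError("all tracks must have the same length")
    if width <= 0 or n_bases == 0 or n_bases % width != 0:
        raise ValueError("track length must be a positive multiple of width")
    bin_size = n_bases // width
    image = []
    for track in tracks:
        # One pixel per consecutive, non-overlapping bin of bases.
        row = [sum(track[i:i + bin_size]) / bin_size
               for i in range(0, n_bases, bin_size)]
        image.append(row)
    return image


def pixel_to_bp(pixel: int, window_start: int, bin_size: int) -> Tuple[int, int]:
    """Map a pixel column back to the half-open base interval it covers."""
    start = window_start + pixel * bin_size
    return start, start + bin_size


# Toy example: 2 tracks of 20 bases pooled into 4 pixels (5 bases per pixel).
toy = tracks_to_image([[1] * 20, list(range(20))], width=4)
```

A mapping like `pixel_to_bp` would be needed in any postprocessing step that turns detections on the image back into genomic coordinates. At 5 bases per pixel, a detection's breakpoints could only be placed to within 5 bp unless a later refinement step is applied.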
Original language: English
Title of host publication: Digitally Driven Biotechnology: 4th DTU Bioengineering symposium
Number of pages: 1
Place of Publication: Kgs. Lyngby, Denmark
Publisher: DTU Bioengineering
Publication date: 2023
Pages: 43-43
Article number: 14
Publication status: Published - 2023
Event: 4th DTU Bioengineering symposium - Kgs. Lyngby, Denmark
Duration: 26 Oct 2023 to 26 Oct 2023

Conference

Conference: 4th DTU Bioengineering symposium
Country/Territory: Denmark
City: Kgs. Lyngby
Period: 26/10/2023 to 26/10/2023
