SinEx DB 2.0 update 2020: database for eukaryotic single-exon coding sequences

R. Jorquera, C. González, Philip Thomas Lanken Conradsen Clausen, B. Petersen, D. S. Holmes*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review


Abstract

Single-exon coding sequences (CDSs), also known as 'single-exon genes' (SEGs), are defined as nuclear, protein-coding genes that lack introns in their CDSs. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of human cancers and neurological/developmental disorders, and many exhibit tissue-specific transcription. We developed SinEx DB, which houses DNA and protein sequence information for SEGs from 10 mammalian genomes, including human. SinEx DB includes their functional predictions (KOG, euKaryotic Orthologous Groups) and the relative distribution of these functions within species. Here, we report SinEx 2.0, a major update of SinEx DB that includes information on the occurrence, distribution and functional prediction of SEGs from 60 completely sequenced eukaryotic genomes, representing animals, fungi, protists and plants. The information is stored in a relational database built with MySQL Server 5.7, and the complete dataset of SEG sequences and their GO (Gene Ontology) functional assignments is available for download. SinEx DB 2.0 was built with a novel pipeline that helps disambiguate single-exon isoforms from SEGs. SinEx DB 2.0 is the largest available database for SEGs and provides a rich source of information for advancing our understanding of the evolution and function of SEGs and their associations with disorders, including cancers and neurological and developmental diseases. Database URL: http://v2.sinex.cl/.
Original language: English
Journal: Database: the journal of biological databases and curation
Volume: 2021
Number of pages: 5
ISSN: 1758-0463
Publication status: Accepted/In press - 2021
