HPC-T-Annotator: an HPC tool for de novo transcriptome assembly annotation

Lorenzo Arcioni, Manuel Arcieri, Jessica Di Martino, Franco Liberati, Paolo Bottoni*, Tiziana Castrignanò*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

Background: The availability of transcriptomic data for species without a reference genome enables the construction of de novo transcriptome assemblies from RNA-Seq data as alternative reference resources. A transcriptome provides direct information about a species’ protein-coding genes under specific experimental conditions. The de novo assembly process produces a unigenes file in FASTA format, which is then annotated. Homology-based annotation, which infers the function of sequences by estimating their similarity to sequences in a reference database, is computationally demanding.

Results: To mitigate the computational burden, we introduce HPC-T-Annotator, a tool for de novo transcriptome homology annotation on high performance computing (HPC) infrastructures, designed for straightforward configuration via a Web interface. Once the configuration data are provided, the entire parallel annotation software is generated automatically and can be launched on a supercomputer with a simple command line. The output data can then be inspected using post-processing utilities, provided as Python notebooks integrated into the software.

Conclusions: HPC-T-Annotator expedites homology-based annotation of de novo transcriptome assemblies. Its efficient parallelization strategy on HPC infrastructures significantly reduces computational load and execution times, enabling large-scale transcriptome analysis and comparison projects, while its intuitive graphical interface extends accessibility to users without IT skills.
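The abstract does not say how the input is partitioned for parallel execution. As a rough illustration of the data-parallel idea only, the sketch below assumes the unigenes FASTA file is split into chunks of similar total sequence length, so that each chunk can be annotated by an independent job. The function names `read_fasta` and `split_records` and the greedy balancing rule are hypothetical and are not taken from HPC-T-Annotator.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    parts: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(parts)


def split_records(records: Iterable[Record], n_chunks: int) -> List[List[Record]]:
    """Greedy balancing (an assumption, not the tool's documented method):
    take sequences longest-first and give each to the lightest chunk."""
    chunks: List[List[Record]] = [[] for _ in range(n_chunks)]
    loads = [0] * n_chunks
    for rec in sorted(records, key=lambda r: len(r[1]), reverse=True):
        i = loads.index(min(loads))  # chunk with the fewest residues so far
        chunks[i].append(rec)
        loads[i] += len(rec[1])
    return chunks


if __name__ == "__main__":
    demo = [">a", "ACGT", "AC", ">b", "GG", ">c", "TTTTTTTT"]
    for idx, chunk in enumerate(split_records(read_fasta(demo), 2)):
        print(idx, [header for header, _ in chunk])
```

Each resulting chunk could then be written to its own FASTA file and handed to a separate scheduler job running the chosen aligner, with the per-chunk results concatenated afterwards. Balancing by residue count rather than by record count is one possible design choice, since alignment time tends to grow with query length.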

Original language: English
Article number: 272
Journal: BMC Bioinformatics
Volume: 25
Issue number: 1
Number of pages: 14
ISSN: 1471-2105
Publication status: Published - 2024

Keywords

  • Bioinformatics
  • Data-parallelism algorithm
  • High performance computing
  • Transcript annotation
