Sharing programming resources between bio* projects

Raoul J. P. Bonnal, Andrew Yates, Naohisa Goto, Laurent Gautier, Scooter Willis, Christopher Fields, Toshiaki Katayama, Pjotr Prins*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Book chapter › Research › peer-review

Abstract

Open-source software encourages computer programmers to reuse software components written by others. In evolutionary bioinformatics, open-source software comes in a broad range of programming languages, including C/C++, Perl, Python, Ruby, Java, and R. To avoid writing the same functionality multiple times for different languages, it is possible to share components by bridging computer languages and Bio* projects, such as BioPerl, Biopython, BioRuby, BioJava, and R/Bioconductor. In this chapter, we compare the three principal approaches for sharing software between different programming languages: by remote procedure call (RPC), by sharing a local “call stack,” and by calling one program from another. RPC provides a language-independent protocol over a network interface; examples are SOAP and Rserve. The local call stack provides a between-language mapping, not over the network interface but directly in computer memory; examples are R bindings, RPy, and languages sharing the Java virtual machine stack. These mechanisms provide strategies for sharing software between Bio* projects, which could be exploited more often. Here, we present cross-language examples for sequence translation and measure the throughput of the different options. We compare calling into R through native R, RSOAP, Rserve, and RPy interfaces with the performance of native BioPerl, Biopython, BioJava, and BioRuby implementations and with call stack bindings to BioJava and the European Molecular Biology Open Software Suite (EMBOSS). In general, call stack approaches outperform native Bio* implementations, and these, in turn, outperform RPC-based approaches. To test and compare strategies, we provide a downloadable Docker container with all examples, tools, and libraries included.
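As a rough illustration of two of the approaches named in the abstract, the sketch below runs the same sequence translation in-process, through a separate program, and over RPC. It is not taken from the chapter: the function names are hypothetical, the codon table is deliberately truncated to five codons, and Python's standard-library XML-RPC stands in for the SOAP and Rserve interfaces the authors benchmark. The call stack approach (for example RPy or JVM languages) is not shown, since it needs a second language runtime.

```python
import subprocess
import sys
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Illustrative, truncated codon table (assumption: not the full genetic code).
CODONS = {"ATG": "M", "GCC": "A", "AAA": "K", "TGG": "W", "TAA": "*"}


def translate(dna):
    """In-process baseline: translate codon by codon, unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODONS.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


# Source of a stand-alone "external tool" that reads DNA on stdin and
# prints the protein on stdout. It could be written in any language.
CHILD = (
    "import sys\n"
    "codons = {'ATG': 'M', 'GCC': 'A', 'AAA': 'K', 'TGG': 'W', 'TAA': '*'}\n"
    "dna = sys.stdin.read().strip().upper()\n"
    "print(''.join(codons.get(dna[i:i+3], 'X') for i in range(0, len(dna) - 2, 3)))\n"
)


def translate_via_program(dna):
    """Program-to-program: start a new process and exchange text via pipes."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=dna, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def translate_via_rpc(dna):
    """RPC: expose translate() on a local XML-RPC server and call it over HTTP."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)  # port 0: any free port
    server.register_function(translate, "translate")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        with ServerProxy(f"http://{host}:{port}/") as proxy:
            return proxy.translate(dna)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    dna = "ATGGCCAAATGGTAA"
    print(translate(dna), translate_via_program(dna), translate_via_rpc(dna))
```

All three calls return the same protein string, but the second pays for process start-up and the third for serialization and a network round trip. That overhead is the kind of cost the throughput comparison in the chapter is concerned with.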

Original language: English
Title of host publication: Evolutionary Genomics: Statistical and Computational Methods
Editors: Maria Anisimova
Number of pages: 20
Publisher: Springer
Publication date: 2019
Pages: 747-766
Chapter: 25
ISBN (Print): 978-1-4939-9073-3
ISBN (Electronic): 978-1-4939-9074-0
DOIs
Publication status: Published - 2019
Series: Methods in Molecular Biology
Volume: 1910
ISSN: 1064-3745

Keywords

  • Bioinformatics
  • EMBOSS
  • Java
  • PAML
  • Perl
  • Python
  • R
  • RPC
  • Ruby
  • Web services
