Peptide Pattern Recognition for high-throughput protein sequence analysis and clustering

Research output: Other contribution; Net publication - Internet publication; Annual report year: 2017; Research



  • Author:


Large collections of divergent protein sequences are tedious to analyze for phylogenetic or structure-function relationships. Peptide Pattern Recognition is an algorithm that was developed to facilitate this task, but the previous version accepts only a limited number of sequences as input. I implemented Peptide Pattern Recognition as multithreaded software designed to handle large numbers of sequences and complete the analysis in a reasonable time frame. Benchmarking showed that the new implementation is twenty times faster than the previous implementation on a small protein collection of 673 MAP kinase sequences. In addition, the new implementation could analyze a large protein collection of 48,570 Glycosyl Transferase family 20 sequences on a desktop computer without reaching its upper limit. Peptide Pattern Recognition is useful software for providing comprehensive groups of related sequences from large protein sequence collections.
Original language: English
Publication date: 2017
Publication status: Published - 2017

Bibliographical note

The copyright holder for this preprint is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
Last modified: 29/08/2017


ID: 141973481