Machine learning approaches for the prediction of signal peptides and other protein sorting signals

Henrik Nielsen, Søren Brunak, Gunnar von Heijne

    Research output: Contribution to journal, Journal article, Research, peer-review


    Prediction of protein sorting signals from the sequence of amino acids has great importance in the field of proteomics today. Recently, the growth of protein databases, combined with machine learning approaches, such as neural networks and hidden Markov models, have made it possible to achieve a level of reliability where practical use in, for example, automatic database annotation is feasible. In this review, we concentrate on the present status and future perspectives of SignalP, our neural network-based method for prediction of the most well-known sorting signal: the secretory signal peptide. We discuss the problems associated with the use of SignalP on genomic sequences, showing that signal peptide prediction will improve further if integrated with predictions of start codons and transmembrane helices. As a step towards this goal, a hidden Markov model version of SignalP has been developed, making it possible to discriminate between cleaved signal peptides and uncleaved signal anchors. Furthermore, we show how SignalP can be used to characterize putative signal peptides from an archaeon, Methanococcus jannaschii. Finally, we briefly review a few methods for predicting other protein sorting signals and discuss the future of protein sorting prediction in general.
    Original language: English
    Journal: Protein Engineering
    Issue number: 1
    Pages (from-to): 3-9
    Publication status: Published - 1999
