Convolutional LSTM Networks for Subcellular Localization of Proteins

Henrik Nielsen, Søren Kaae Sønderby, Casper Kaae Sønderby, Ole Winther

    Research output: Contribution to conference › Conference abstract for conference › Research › peer-review


    Abstract

    Machine learning is widely used to analyze biological sequence data. Non-sequential models such as SVMs or feed-forward neural networks are often used, although they have no natural way of handling sequences of varying length. Recurrent neural networks such as the long short-term memory (LSTM) model, on the other hand, are designed to handle sequences. In this study we demonstrate that LSTM networks predict the subcellular location of proteins with high accuracy (0.902) given only the protein sequence, outperforming current state-of-the-art algorithms. We further improve performance by introducing convolutional filters, and we experiment with an attention mechanism that lets the LSTM focus on specific parts of the protein. Lastly, we introduce new visualizations of both the convolutional filters and the attention mechanisms and show how they can be used to extract biologically relevant knowledge from the LSTM networks.
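
    The abstract does not specify how the attention mechanism is built. As an illustration only, the general idea of letting a model weight some sequence positions more than others can be sketched as a softmax-weighted average over per-position hidden vectors. The function name `attention_pool`, its inputs, and the toy values below are hypothetical and are not taken from the authors' implementation.

```python
import math


def attention_pool(hidden_states, scores):
    """Softmax-weighted average of per-position hidden vectors.

    hidden_states: list of T vectors (lists of floats), standing in for
        the outputs of a recurrent layer at each sequence position.
    scores: list of T unnormalised relevance scores, one per position.
    Both inputs are illustrative assumptions, not the paper's model.
    """
    if not scores or len(hidden_states) != len(scores):
        raise ValueError("need at least one position and one score per position")

    # Numerically stable softmax over sequence positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the hidden vectors, one output value per dimension.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


# Toy example: three positions with two-dimensional hidden vectors.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = attention_pool(states, [0.0, 0.0, 0.0])
```

    Because the weights sum to one and each is tied to a sequence position, they can be plotted along the protein sequence, which is the kind of inspection the abstract alludes to when it mentions visualizing the attention mechanism.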
    Original language: English
    Publication date: 2015
    Number of pages: 1
    Publication status: Published - 2015
    Event: First Annual Danish Bioinformatics Conference - Odense, Denmark
    Duration: 27 Aug 2015 to 27 Nov 2015

    Conference

    Conference: First Annual Danish Bioinformatics Conference
    Country/Territory: Denmark
    City: Odense
    Period: 27/08/2015 to 27/11/2015
