Deep recurrent conditional random field network for protein secondary prediction

Alexander Rosenberg Johansen, Søren Kaae Sønderby, Casper Kaae Sønderby, Ole Winther

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

Deep learning has become the state-of-the-art method for predicting protein secondary structure from only its amino acid residues and sequence profile. Building upon these results, we propose combining a bi-directional recurrent neural network (biRNN) with a conditional random field (CRF), which we call the biRNN-CRF. The biRNN-CRF may be seen as an improved alternative to an autoregressive uni-directional RNN, in which predictions are made sequentially, each conditioned on the prediction at the previous time-step. The CRF is instead nearest-neighbor-aware and models the joint distribution of the labels over all time-steps. We condition the CRF on the output of the biRNN, which learns a distributed representation based on the entire sequence. The biRNN-CRF is therefore close to ideally suited for the secondary structure task, because a high degree of cross-talk between neighboring elements can be expected. We validate the model on several benchmark datasets. For example, on CB513, a model with 1.7 million parameters achieves a Q8 accuracy of 69.4 for a single model and 70.9 for an ensemble, which to our knowledge is state-of-the-art.
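
As an illustration of the CRF component described in the abstract, the following is a minimal NumPy sketch of a first-order linear-chain CRF: the log-likelihood of a label sequence (computed with the forward algorithm) and Viterbi decoding of the most probable sequence. It is not the authors' implementation. The function names, the `emissions` array (standing in for per-residue label scores that a biRNN would produce) and the `transitions` matrix (scores for adjacent label pairs) are all assumptions made for this example.

```python
import numpy as np


def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def crf_sequence_score(emissions, transitions, labels):
    """Unnormalised score of one label sequence.

    emissions:   (T, K) per-position label scores (assumed biRNN output).
    transitions: (K, K), transitions[i, j] scores label i followed by label j.
    labels:      length-T sequence of integer labels in [0, K).
    """
    labels = np.asarray(labels)
    score = emissions[np.arange(len(labels)), labels].sum()
    score += transitions[labels[:-1], labels[1:]].sum()
    return score


def crf_log_partition(emissions, transitions):
    """log Z: log-sum of scores over all K**T label sequences (forward algorithm)."""
    alpha = emissions[0]
    for t in range(1, len(emissions)):
        # Marginalise over the previous label, then add the current emission.
        alpha = logsumexp(alpha[:, None] + transitions, axis=0) + emissions[t]
    return logsumexp(alpha, axis=0)


def crf_log_likelihood(emissions, transitions, labels):
    """log p(labels | emissions) under the joint model over all positions."""
    return crf_sequence_score(emissions, transitions, labels) - crf_log_partition(
        emissions, transitions
    )


def viterbi_decode(emissions, transitions):
    """Most probable label sequence under the same model."""
    T, K = emissions.shape
    delta = emissions[0]
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = delta[:, None] + transitions  # rows: previous label, cols: current
        backptr[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + emissions[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]
```

For an 8-class (Q8) task, `K` would be 8 and `T` the protein length. Training would maximise `crf_log_likelihood` with respect to both the network producing `emissions` and the `transitions` matrix, while `viterbi_decode` gives the predicted labels at test time.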
Original language: English
Title of host publication: 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
Number of pages: 6
Publisher: Association for Computing Machinery
Publication date: 2017
Pages: 73-78
ISBN (Print): 9781450347228
DOI: https://doi.org/10.1145/3107411.3107489
Publication status: Published - 2017
Event: 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics - Boston, United States
Duration: 20 Aug 2017 to 23 Aug 2017

Conference

Conference: 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
Country: United States
City: Boston
Period: 20/08/2017 to 23/08/2017
Series: Acm-bcb - Proc. Acm Int. Conf. Bioinform., Comput. Biol., Health Informatics

Cite this

Johansen, A. R., Sønderby, S. K., Sønderby, C. K., & Winther, O. (2017). Deep recurrent conditional random field network for protein secondary prediction. In 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics (pp. 73-78). Association for Computing Machinery. Acm-bcb - Proc. Acm Int. Conf. Bioinform., Comput. Biol., Health Informatics https://doi.org/10.1145/3107411.3107489