A generic deep convolutional neural network framework for prediction of receptor–ligand interactions—NetPhosPan: application to kinase phosphorylation prediction

Emilio Fenoy, Jose Maria Gonzalez-Izarzugaza, Vanessa Jurtz, Søren Brunak, Morten Nielsen*

*Corresponding author for this work

Research output: Contribution to journal, Journal article, Research, peer-review

Abstract

Motivation: Understanding the specificity of protein receptor-ligand interactions is pivotal for our comprehension of biological mechanisms and systems. Receptor protein families often have a certain level of sequence diversity that converges into fewer conserved protein structures, allowing the exertion of well-defined functions. T and B cell receptors of the immune system and protein kinases that control the dynamic behaviour and decision processes in eukaryotic cells by catalysing phosphorylation represent prime examples. Driven by the large sequence diversity, the receptors within such protein families are often found to share specificities although divergent at the sequence level. This observation has led to the notion that prediction models of such systems are most effectively handled in a receptor-specific manner.
Results: We show that this approach in many cases is suboptimal, and describe an alternative improved framework for generating models with pan-receptor predictive power for receptor protein families. The framework is based on deep artificial neural networks and integrates information from individual receptors into a single pan-receptor model, leveraging information across multiple receptor-specific data sets allowing predictions of the receptor specificity for all members of a given protein family including those described by limited or no ligand data. The approach was applied to the protein kinase superfamily, leading to the method NetPhosPan. The method was extensively validated and benchmarked against state-of-the-art prediction methods and was found to have unprecedented performance, particularly for kinase domains characterized by limited or no experimental data.
Original language: English
Journal: Bioinformatics
Volume: 35
Issue number: 7
Pages (from-to): 1098-1107
ISSN: 1367-4803
DOI: 10.1093/bioinformatics/bty715
Publication status: Published - 2019

Cite this

@article{f5c09b5418554a0e90f3a6084b1b37d4,
title = "A generic deep convolutional neural network framework for prediction of receptor--ligand interactions---{NetPhosPan}: application to kinase phosphorylation prediction",
abstract = "Motivation: Understanding the specificity of protein receptor-ligand interactions is pivotal for our comprehension of biological mechanisms and systems. Receptor protein families often have a certain level of sequence diversity that converges into fewer conserved protein structures, allowing the exertion of well-defined functions. T and B cell receptors of the immune system and protein kinases that control the dynamic behaviour and decision processes in eukaryotic cells by catalysing phosphorylation represent prime examples. Driven by the large sequence diversity, the receptors within such protein families are often found to share specificities although divergent at the sequence level. This observation has led to the notion that prediction models of such systems are most effectively handled in a receptor-specific manner. Results: We show that this approach in many cases is suboptimal, and describe an alternative improved framework for generating models with pan-receptor predictive power for receptor protein families. The framework is based on deep artificial neural networks and integrates information from individual receptors into a single pan-receptor model, leveraging information across multiple receptor-specific data sets allowing predictions of the receptor specificity for all members of a given protein family including those described by limited or no ligand data. The approach was applied to the protein kinase superfamily, leading to the method NetPhosPan. The method was extensively validated and benchmarked against state-of-the-art prediction methods and was found to have unprecedented performance, particularly for kinase domains characterized by limited or no experimental data.",
author = "Emilio Fenoy and Gonzalez-Izarzugaza, {Jose Maria} and Vanessa Jurtz and S{\o}ren Brunak and Morten Nielsen",
year = "2019",
doi = "10.1093/bioinformatics/bty715",
language = "English",
volume = "35",
pages = "1098--1107",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "7",
}

A generic deep convolutional neural network framework for prediction of receptor–ligand interactions—NetPhosPan: application to kinase phosphorylation prediction. / Fenoy, Emilio; Gonzalez-Izarzugaza, Jose Maria; Jurtz, Vanessa; Brunak, Søren; Nielsen, Morten.

In: Bioinformatics, Vol. 35, No. 7, 2019, p. 1098-1107.

TY  - JOUR
T1  - A generic deep convolutional neural network framework for prediction of receptor–ligand interactions—NetPhosPan: application to kinase phosphorylation prediction
AU  - Fenoy, Emilio
AU  - Gonzalez-Izarzugaza, Jose Maria
AU  - Jurtz, Vanessa
AU  - Brunak, Søren
AU  - Nielsen, Morten
PY  - 2019
Y1  - 2019
AB  - Motivation: Understanding the specificity of protein receptor-ligand interactions is pivotal for our comprehension of biological mechanisms and systems. Receptor protein families often have a certain level of sequence diversity that converges into fewer conserved protein structures, allowing the exertion of well-defined functions. T and B cell receptors of the immune system and protein kinases that control the dynamic behaviour and decision processes in eukaryotic cells by catalysing phosphorylation represent prime examples. Driven by the large sequence diversity, the receptors within such protein families are often found to share specificities although divergent at the sequence level. This observation has led to the notion that prediction models of such systems are most effectively handled in a receptor-specific manner. Results: We show that this approach in many cases is suboptimal, and describe an alternative improved framework for generating models with pan-receptor predictive power for receptor protein families. The framework is based on deep artificial neural networks and integrates information from individual receptors into a single pan-receptor model, leveraging information across multiple receptor-specific data sets allowing predictions of the receptor specificity for all members of a given protein family including those described by limited or no ligand data. The approach was applied to the protein kinase superfamily, leading to the method NetPhosPan. The method was extensively validated and benchmarked against state-of-the-art prediction methods and was found to have unprecedented performance, particularly for kinase domains characterized by limited or no experimental data.
U2  - 10.1093/bioinformatics/bty715
DO  - 10.1093/bioinformatics/bty715
M3  - Journal article
C2  - 30169744
VL  - 35
SP  - 1098
EP  - 1107
JO  - Bioinformatics
JF  - Bioinformatics
SN  - 1367-4803
IS  - 7
ER  -