TY - JOUR
T1 - NetMHCphosPan
T2 - Pan-specific prediction of MHC class I antigen presentation of phosphorylated ligands
AU - Refsgaard, Carina Thusgaard
AU - Barra, Carolina
AU - Peng, Xu
AU - Ternette, Nicola
AU - Nielsen, Morten
PY - 2021
Y1 - 2021
N2 - Post-translational modifications of proteins play a crucial part in carcinogenesis. Phosphorylated peptides have been shown to be presented by MHC class I molecules and recognised by cytotoxic T cells, making them a promising target for immunotherapy. Identification of phosphorylated MHC class I ligands has so far predominantly been done using bioinformatic tools trained on unmodified peptides. Only one tool, PhosMHCpred, has been developed specifically for the prediction of phosphorylated MHC class I ligands so far, and this tool has been trained only on a limited number of alleles and provides a limited peptide length coverage (only including 9-mers). Here we propose a method, termed NetMHCphosPan, for the prediction of MHC-presented phosphopeptides. The method is trained using the NNAlign_MA framework, which allows incorporating mixed data types and information leverage between data sets, resulting in a greatly improved MHC and peptide length coverage and an overall increased predictive power compared to PhosMHCpred. Motif deconvolution suggested a strong preference for phosphosites to be located in position 4 of the binding motif, and enrichment of proline at P5 and arginine at P1. The improved performance of NetMHCphosPan over current state-of-the-art methods, driven by the extended length and allelic coverage, was further validated on a large benchmark data set independent from the model development. In conclusion, we have confirmed the high power of NNAlign_MA for motif deconvolution of complex immuno-peptidomics data and have developed a novel method for prediction of MHC-presented phosphopeptides with improved predictive power and a broader peptide length and MHC coverage compared to current state-of-the-art methods. The developed method is available at http://www.cbs.dtu.dk/services/NetMHCphosPan-1.0 .
AB - Post-translational modifications of proteins play a crucial part in carcinogenesis. Phosphorylated peptides have been shown to be presented by MHC class I molecules and recognised by cytotoxic T cells, making them a promising target for immunotherapy. Identification of phosphorylated MHC class I ligands has so far predominantly been done using bioinformatic tools trained on unmodified peptides. Only one tool, PhosMHCpred, has been developed specifically for the prediction of phosphorylated MHC class I ligands so far, and this tool has been trained only on a limited number of alleles and provides a limited peptide length coverage (only including 9-mers). Here we propose a method, termed NetMHCphosPan, for the prediction of MHC-presented phosphopeptides. The method is trained using the NNAlign_MA framework, which allows incorporating mixed data types and information leverage between data sets, resulting in a greatly improved MHC and peptide length coverage and an overall increased predictive power compared to PhosMHCpred. Motif deconvolution suggested a strong preference for phosphosites to be located in position 4 of the binding motif, and enrichment of proline at P5 and arginine at P1. The improved performance of NetMHCphosPan over current state-of-the-art methods, driven by the extended length and allelic coverage, was further validated on a large benchmark data set independent from the model development. In conclusion, we have confirmed the high power of NNAlign_MA for motif deconvolution of complex immuno-peptidomics data and have developed a novel method for prediction of MHC-presented phosphopeptides with improved predictive power and a broader peptide length and MHC coverage compared to current state-of-the-art methods. The developed method is available at http://www.cbs.dtu.dk/services/NetMHCphosPan-1.0 .
KW - MHC antigen presentation
KW - Phosphorylation
KW - Motif deconvolution
KW - T cell epitopes
U2 - 10.1016/j.immuno.2021.100005
DO - 10.1016/j.immuno.2021.100005
M3 - Journal article
SN - 2667-1190
VL - 1-2
JO - Immunoinformatics
JF - Immunoinformatics
M1 - 100005
ER -