Ever since the signal hypothesis was proposed in 1971, the exact nature of signal peptides has been a focus point of research. The prediction of signal peptides and protein subcellular location from amino acid sequences has been an important problem in bioinformatics since the dawn of this research field, involving many statistical and machine learning technologies. In this review, we provide a historical account of how position-weight matrices, artificial neural networks, hidden Markov models, support vector machines and, lately, deep learning techniques have been used in the attempts to predict where proteins go. Because the secretory pathway was the first one to be studied both experimentally and through bioinformatics, our main focus is on the historical development of prediction methods for signal peptides that target proteins for secretion; prediction methods to identify targeting signals for other cellular compartments are treated in less detail.
- Signal peptides
- Protein sorting