The primary structure of a protein is the sequence of its amino acids. The secondary structure describes structural properties of the molecule such as which parts of it form sheets, helices or coils. Spacial and other properties are described by the higher order structures. The classification task we are considering here, is to predict the secondary structure from the primary one. To this end we train a Markov model on training data and then use it to classify parts of unknown protein sequences as sheets, helices or coils. We show how to exploit the directional information contained in the Markov model for this task. Classifications that are purely based on statistical models might not always be biologically meaningful. We present combinatorial methods to incorporate biological background knowledge to enhance the prediction performance.
|Title of host publication||Proceedings of 29th Annual Conference of the German Classification Society (GfKl 2005)|
|Publication status||Published - 2004|