Many proteins contain a large number of NXS/T sequences (where X is any amino acid except proline), which are the potential sites of asparagine (N)-linked glycosylation. However, the patterns of occurrence of these N-glycosylation sequons in related proteins or groups of proteins, and their underlying causes, have largely been unexplored. We computed the actual and probabilistic occurrence of NXS/T sequons in ABC protein superfamilies from eight diverse eukaryotic organisms. The ABC proteins contained significantly more NXS/T sequons than the respective genome-wide averages, but their sequon density was significantly lower owing to the increase in protein size and the decrease in sequon-specific amino acids. However, mammalian ABC proteins have significantly higher sequon density, and both serine- and threonine-containing sequons (NXS and NXT) have been positively selected, in contrast to recent findings of only threonine-specific Darwinian selection of sequons in proteins. The occurrence of sequons was positively correlated with the frequency of sequon-specific amino acids and negatively correlated with proline and NPS/T sequences. Further, NPS/T sequences were significantly more frequent than expected in plant ABC proteins, which have the lowest number of NXS/T sequons. Thus, compared to proteins overall, N-glycosylation sequons in ABC protein superfamilies have a distinct pattern of occurrence, and the results are discussed from an evolutionary perspective.
- ABC proteins, evolution
- N-glycosylation sequons
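
To make the distinction between actual and probabilistic sequon occurrence concrete, the following is a minimal sketch in Python. It is not the authors' pipeline: the function names are hypothetical, and the expected count assumes that residues occur independently at their observed frequencies in the sequence, which is only one simple way to define a probabilistic expectation.

```python
import re
from collections import Counter

# Zero-width lookaheads so that overlapping sequons (e.g. "NNTS") are all counted.
# Note: [^P] accepts any character other than P, including ambiguity codes.
NXST = re.compile(r"(?=N[^P][ST])")   # N-X-S/T with X != proline
NPST = re.compile(r"(?=NP[ST])")      # N-P-S/T, not a glycosylation sequon


def count_sequons(seq: str) -> int:
    """Observed number of NXS/T sequons in a protein sequence."""
    return len(NXST.findall(seq.upper()))


def count_npst(seq: str) -> int:
    """Observed number of NPS/T sequences."""
    return len(NPST.findall(seq.upper()))


def expected_sequons(seq: str) -> float:
    """Expected NXS/T count, ASSUMING independent residues at observed frequencies.

    E = (L - 2) * f(N) * (1 - f(P)) * (f(S) + f(T))
    """
    seq = seq.upper()
    length = len(seq)
    if length < 3:
        return 0.0
    counts = Counter(seq)

    def freq(aa: str) -> float:
        return counts[aa] / length

    return (length - 2) * freq("N") * (1.0 - freq("P")) * (freq("S") + freq("T"))


def sequon_density(seq: str) -> float:
    """Observed sequons per residue (assumed definition of density)."""
    return count_sequons(seq) / len(seq) if seq else 0.0


if __name__ == "__main__":
    toy = "NASNPTNNTS"  # made-up toy sequence, not real data
    print(count_sequons(toy), count_npst(toy), expected_sequons(toy), sequon_density(toy))
```

Under this assumed model, a longer protein with the same number of sequons has a lower density, and a lower frequency of N, S, or T lowers the expected count, in line with the relationships the abstract describes.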