We examined more than 700 DNA sequences (full-length chromosomes and plasmids) for stretches of purines (R) or pyrimidines (Y) and for alternating YR stretches; such regions are likely to adopt structures that differ from the canonical B-form. Since one turn of the DNA helix is roughly 10 bp, we measured the fraction of each genome that consists of purine (or pyrimidine) tracts of 10 bp or longer (hereafter referred to as 'purine tracts'), as well as stretches of alternating pyrimidines/purines ('pyr/pur tracts') of the same length. Using these criteria, a random sequence would be expected to consist of 1.0% purine tracts and 1.0% alternating pyr/pur tracts. In the vast majority of cases, there are more purine tracts than would be expected from a random sequence, with an average of 3.5%, significantly higher than the expected value. The fraction of the chromosomes made up of pyr/pur tracts was slightly lower than expected, with an average of 0.8%. One of the most surprising findings is a clear difference between prokaryotes and eukaryotes in the length distributions of the regions studied. Whereas short-range correlations can explain the length distributions in prokaryotes, eukaryotes show an abundance of long purine tracts and long alternating pyr/pur tracts that cannot be explained in this way; these sequences are likely to play an important role in eukaryotic chromosome organisation.
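The tract measurement described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names (`to_ry`, `tract_fraction`, `alternating_fraction`, `expected_random_fraction`) are hypothetical, and treating any non-ACGT symbol as a tract breaker is an assumption. The closed form in `expected_random_fraction` further assumes independent positions with equal purine and pyrimidine probability, which may not be the null model used in the study; under that assumption it gives 11/1024, roughly 1.1%, close to the 1.0% quoted above.

```python
import re

PURINES = set("AG")
PYRIMIDINES = set("CT")


def to_ry(seq):
    """Map a DNA string to R (purine), Y (pyrimidine) or N (anything else)."""
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
        else:
            out.append("N")  # assumption: ambiguous bases break a tract
    return "".join(out)


def tract_fraction(seq, min_len=10):
    """Fraction of positions lying in all-R or all-Y runs of >= min_len."""
    ry = to_ry(seq)
    if not ry:
        return 0.0
    pattern = re.compile(r"R{%d,}|Y{%d,}" % (min_len, min_len))
    covered = sum(m.end() - m.start() for m in pattern.finditer(ry))
    return covered / len(ry)


def alternating_fraction(seq, min_len=10):
    """Fraction of positions lying in alternating R/Y runs of >= min_len."""
    ry = to_ry(seq)
    if not ry:
        return 0.0
    covered = 0
    run = 0      # length of the current alternating run
    prev = None  # previous R/Y symbol, or None after a break
    for ch in ry:
        if ch == "N":
            if run >= min_len:
                covered += run
            run, prev = 0, None
        elif prev is not None and ch != prev:
            run += 1  # alternation continues
            prev = ch
        else:
            # Repeated symbol (or first symbol): close the run, start a new one.
            if run >= min_len:
                covered += run
            run, prev = 1, ch
    if run >= min_len:
        covered += run
    return covered / len(ry)


def expected_random_fraction(min_len=10):
    """Expected fraction for an i.i.d. sequence with P(R) = P(Y) = 1/2.

    A position lies in a run of exactly L identical symbols with
    probability L / 2**(L + 1); summing over L >= min_len gives
    (min_len + 1) / 2**min_len. By symmetry the same value applies
    to alternating runs.
    """
    return (min_len + 1) / 2 ** min_len
```

Applied to a full chromosome, `tract_fraction` and `alternating_fraction` would return the two per-genome fractions that the abstract compares against the random expectation.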
|Journal||Computers & Chemistry|
|Publication status||Published - 2002|