Abstract
Motivation. The primary function of DNA is to carry genetic information through the genetic code. DNA, however, contains a variety of other signals related, for instance, to reading frame, codon bias, pairwise codon bias, splice sites, transcription regulation, nucleosome positioning and DNA structure. Here we study the relationship between the genetic code and DNA structure and address two questions. First, to what degree do the degeneracy of the genetic code and the acceptable amino acid substitution patterns allow DNA structural signals to be superimposed on protein-coding sequences? Second, is the origin or evolution of the genetic code likely to have been constrained by DNA structure?

Results. We develop an index for code flexibility with respect to DNA structure. Using five different di- or tri-nucleotide models of sequence-dependent DNA structure, we show that the standard genetic code provides a fair level of flexibility at the level of broad amino acid categories. Thus the code generally allows any structural signal to be superimposed on any protein-coding sequence through amino acid substitution. The flexibility observed at the level of single amino acids allows only localized and loosely positioned signals to be superimposed on conserved amino acid sequences. Compared with several classes of alternative codes, the standard genetic code shows a low or average degree of flexibility. This result is consistent with the view that DNA structure is not likely to have played a significant role in the origin and evolution of the genetic code.
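The abstract does not define the flexibility index, so the following is only a minimal sketch of the underlying idea: how much a per-codon score can vary while the encoded amino acid stays fixed. It assumes a stand-in score (`gc_count`, the number of G or C bases in a codon) in place of a real di- or tri-nucleotide structural scale, and it measures flexibility as the spread of that score across synonymous codons. The names `gc_count`, `synonymous_codons` and `score_spread` are hypothetical and are not taken from the paper.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_count(codon):
    """Stand-in score (an assumption): number of G or C bases in the codon.

    A real analysis would use a published structural scale instead.
    """
    return sum(base in "GC" for base in codon)


def synonymous_codons(table):
    """Group sense codons by the amino acid they encode (stop codons skipped)."""
    groups = {}
    for codon, aa in table.items():
        if aa != "*":
            groups.setdefault(aa, []).append(codon)
    return groups


def score_spread(table, score):
    """Per amino acid, the range (max - min) of the score over its synonymous codons."""
    spread = {}
    for aa, codons in synonymous_codons(table).items():
        values = [score(codon) for codon in codons]
        spread[aa] = max(values) - min(values)
    return spread


spread = score_spread(CODON_TABLE, gc_count)
for aa in sorted(spread):
    print(aa, spread[aa])
```

Amino acids with a single codon (Met, Trp) have zero spread under any score, while six-codon amino acids such as Leu and Arg leave the most room to change the score without changing the protein. Swapping `gc_count` for a function that looks up a structural scale, or passing a different codon table, would reuse the same comparison for alternative codes.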
Original language | English |
---|---|
Journal | Bioinformatics |
Volume | 17 |
Issue number | 3 |
Pages (from-to) | 237-248 |
Number of pages | 12 |
ISSN | 1367-4803 |
Publication status | Published - 2001 |
Keywords
- Amino Acids
- Bacterial Proteins
- DNA
- DNA, Bacterial
- Dinucleotide Repeats
- Escherichia coli
- Evolution, Molecular
- Nucleic Acid Conformation
- Sequence Analysis, DNA
- Trinucleotide Repeats
- DNA (CAS Registry Number 9007-49-2)
- amino acid
- amino acid sequence
- amino acid substitution
- article
- codon
- controlled study
- dinucleotide repeat
- DNA structure
- genetic code
- molecular evolution
- nucleosome
- nucleotide sequence
- position
- priority journal
- transcription regulation
- trinucleotide repeat
- bioinformatics
- DNA amino acid composition
- DNA structure-based genetic code flexibility
- genetic code evolution
- Facultatively Anaerobic Gram-Negative Rods Eubacteria Bacteria Microorganisms (Bacteria, Eubacteria, Microorganisms) - Enterobacteriaceae [06702] Escherichia coli
- Organisms (Organisms) - Organisms [00500] organisms
- double-stranded DNA base stacking energy, bendability, position preference, propeller twist angle, protein deformability
- 00530, General biology - Information, documentation, retrieval and computer applications
- 03502, Genetics - General
- 10062, Biochemistry studies - Nucleic acids, purines and pyrimidines
- 31000, Physiology and biochemistry of bacteria
- 31500, Genetics of bacteria and viruses
- Biochemistry and Molecular Biophysics
- Computational Biology
- Computer Applications
- Information Studies
- Molecular Genetics