Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence

S.T. Cole, Anders Stærmose Krogh

    Research output: Contribution to journal; Journal article; Research; peer-review


    Countless millions of people have died from tuberculosis, a chronic infectious disease caused by the tubercle bacillus. The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve our understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions. The genome comprises 4,411,529 base pairs, contains around 4,000 genes, and has a very high guanine + cytosine content that is reflected in the biased amino-acid content of the proteins. M. tuberculosis differs radically from other bacteria in that a very large portion of its coding capacity is devoted to the production of enzymes involved in lipogenesis and lipolysis, and to two new families of glycine-rich proteins with a repetitive structure that may represent a source of antigenic variation.
    Original language: English
    Issue number: 6716
    Pages (from-to): 537-544
    Publication status: Published - 1998
