Characterization of prokaryotic and eukaryotic promoters using hidden Markov models

Anders Gorm Pedersen, P. Baldi, Y. Chauvin, Søren Brunak

    Research output: Chapter in Book/Report/Conference proceeding > Article in proceedings > Research > peer-review

    Abstract

    In this paper we use hidden Markov models (HMMs) and information theory to analyze prokaryotic and eukaryotic promoters. We place special emphasis on the fact that promoters are divided into a number of different classes, depending on which polymerase-associated factors bind to them. We find that HMMs trained on such subclasses of Escherichia coli promoters (specifically, the so-called sigma 70 and sigma 54 classes) give an excellent classification of unknown promoters with respect to sigma class. HMMs trained on eukaryotic sequences from human genes also model all the essential well-known signals, in addition to a potentially new signal upstream of the TATA box. We furthermore employ a novel technique for automatically discovering different classes in the input data (the promoters) using a system of self-organizing parallel HMMs. These self-organizing HMMs can simultaneously find clusters and model the sequential structure in the input data. This is highly relevant in situations where the variance in the data is high, as is the case for the subclass structure in, for example, promoter sequences.
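    The classification idea in the abstract (one HMM per promoter class, with an unknown sequence assigned to the class whose model scores it best) can be illustrated with a minimal sketch. This is not the authors' implementation: the two-state architecture, all probabilities, and the names `class_A`, `class_B`, `log_likelihood`, and `classify` are hypothetical, chosen only to show the forward algorithm and the likelihood comparison.

```python
import math

ALPHABET = "ACGT"


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(seq, start, trans, emit):
    """Forward algorithm in log space: returns log P(seq | model).

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][k]  : probability that state s emits ALPHABET[k]
    All probabilities are assumed to be strictly positive.
    """
    n = len(start)
    obs = [ALPHABET.index(c) for c in seq]
    # Initialisation: start in state s and emit the first symbol.
    alpha = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    # Recursion: sum over all predecessor states, then emit the next symbol.
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[p] + math.log(trans[p][s]) for p in range(n)])
            + math.log(emit[s][o])
            for s in range(n)
        ]
    # Termination: sum over all possible final states.
    return logsumexp(alpha)


# Hypothetical toy models (NOT trained on real promoters): each has a
# "signal" state with a skewed base composition and a uniform background state.
START = [0.5, 0.5]
TRANS = [[0.8, 0.2], [0.2, 0.8]]
UNIFORM = [0.25, 0.25, 0.25, 0.25]
MODELS = {
    "class_A": (START, TRANS, [[0.4, 0.1, 0.1, 0.4], UNIFORM]),  # AT-rich signal
    "class_B": (START, TRANS, [[0.1, 0.4, 0.4, 0.1], UNIFORM]),  # GC-rich signal
}


def classify(seq, models):
    """Return the name of the model with the highest log-likelihood for seq."""
    return max(models, key=lambda name: log_likelihood(seq, *models[name]))


if __name__ == "__main__":
    for s in ("TATAAT", "GGCGCC"):
        print(s, classify(s, MODELS))
```

    In practice the per-class models would be trained on labelled promoter sequences (for example with Baum-Welch) and would be far larger than this toy; the sketch only shows the scoring and comparison step.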
    Original language: English
    Title of host publication: ISMB-96 Proceedings
    Number of pages: 10
    Volume: 4
    Publisher: AAAI Press
    Publication date: 1996
    Pages: 182-191
    Publication status: Published - 1996
    Event: 4th International Conference on Intelligent Systems for Molecular Biology - Washington, St. Louis, United States
    Duration: 12 Jun 1996 to 15 Jun 1996
    Conference number: 4
    https://www.iscb.org/cms_addon/conferences/ismb1996/ISMB-96%20Home%20Page.htm

    Conference

    Conference: 4th International Conference on Intelligent Systems for Molecular Biology
    Number: 4
    Location: Washington
    Country/Territory: United States
    City: St. Louis
    Period: 12/06/1996 to 15/06/1996
    Series: International Conference on Intelligent Systems for Molecular Biology. Proceedings
    ISSN: 1553-0833

    Keywords

    • article
    • Escherichia coli
    • gene library
    • human
    • probability
    • promoter region
    • Genomic Library
    • Humans
    • Markov Chains
    • Promoter Regions (Genetics)
