Characterization of prokaryotic and eukaryotic promoters using hidden Markov models

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings – Annual report year: 1996; Research; peer-review


In this paper we utilize hidden Markov models (HMMs) and information theory to analyze prokaryotic and eukaryotic promoters. We perform this analysis with special emphasis on the fact that promoters are divided into a number of different classes, depending on which polymerase-associated factors bind to them. We find that HMMs trained on such subclasses of Escherichia coli promoters (specifically, the so-called sigma 70 and sigma 54 classes) give an excellent classification of unknown promoters with respect to sigma-class. HMMs trained on eukaryotic sequences from human genes also model all the essential, well-known signals well, in addition to a potentially new signal upstream of the TATA-box. We furthermore employ a novel technique for automatically discovering different classes in the input data (the promoters) using a system of self-organizing parallel HMMs. These self-organizing HMMs can simultaneously find clusters and model the sequential structure in the input data. This is highly relevant in situations where the variance in the data is high, as is the case for the subclass structure in, for example, promoter sequences.
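
As a rough illustration of the classification idea described in the abstract (one HMM per promoter class, with an unknown sequence assigned to the class whose model gives it the highest likelihood), the following is a minimal sketch and not the authors' implementation. The two toy models, their names (`class_A`, `class_B`), and all probabilities are hypothetical placeholders; they are not trained sigma 70 or sigma 54 models, and the paper's actual HMM architecture is not specified here.

```python
import math

def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_likelihood(seq, start, trans, emit):
    """Forward algorithm in log space: returns log P(seq | model).

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][c]  : probability that state i emits nucleotide c
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][seq[0]]) for i in range(n)]
    for c in seq[1:]:
        alpha = [
            logsumexp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][c])
            for j in range(n)
        ]
    return logsumexp(alpha)

def classify(seq, models):
    """Score seq under every class model and return (best_label, all_scores)."""
    scores = {name: log_likelihood(seq, *params) for name, params in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical two-state toy models (NOT trained promoter models).
UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
MODELS = {
    # State 0 prefers A/T, state 1 is background.
    "class_A": (START, TRANS, [{"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}, UNIFORM]),
    # State 0 prefers G/C, state 1 is background.
    "class_B": (START, TRANS, [{"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}, UNIFORM]),
}

if __name__ == "__main__":
    label, scores = classify("TATAATTATA", MODELS)
    print(label, scores)
```

In practice the class models would be trained on labelled promoter sets rather than written by hand, and comparing raw likelihoods assumes sequences of equal length and equal class priors.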
Original language: English
Title of host publication: ISMB-96 Proceedings
Number of pages: 10
Volume: 4
Publisher: AAAI Press
Publication date: 1996
Pages: 182-191
Publication status: Published - 1996
Event: International Conference on Intelligent Systems for Molecular Biology - St. Louis, USA
Duration: 1 Jan 1996 → …
Conference number: 4

Conference

Conference: International Conference on Intelligent Systems for Molecular Biology
Number: 4
City: St. Louis, USA
Period: 01/01/1996 → …
Series: International Conference on Intelligent Systems for Molecular Biology. Proceedings
ISSN: 1553-0833

    Research areas

  • article, Escherichia coli, gene library, human, probability, promoter region, Genomic Library, Humans, Markov Chains, Promoter Regions (Genetics)

ID: 96748365