SigniSite: Identification of residue-level genotype-phenotype correlations in protein multiple sequence alignments

Leon Ivar Jessen, Ilka Hoof, Ole Lund, Morten Nielsen

    Research output: Contribution to journal › Journal article › Research › peer-review

    Abstract

    Identifying which mutation(s) within a given genotype is responsible for an observable phenotype is important in many aspects of molecular biology. Here, we present SigniSite, an online application for subgroup-free residue-level genotype–phenotype correlation. In contrast to similar methods, SigniSite does not require any pre-definition of subgroups or binary classification. Input is a set of protein sequences where each sequence has an associated real number, quantifying a given phenotype. SigniSite will then identify which amino acid residues are significantly associated with the data set phenotype. As output, SigniSite displays a sequence logo, depicting the strength of the phenotype association of each residue and a heat-map identifying ‘hot’ or ‘cold’ regions. SigniSite was benchmarked against SPEER, a state-of-the-art method for the prediction of specificity determining positions (SDP) using a set of human immunodeficiency virus protease-inhibitor genotype–phenotype data and corresponding resistance mutation scores from the Stanford University HIV Drug Resistance Database, and a data set of protein families with experimentally annotated SDPs. For both data sets, SigniSite was found to outperform SPEER. SigniSite is available at: http://www.cbs.dtu.dk/services/SigniSite/.
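The input/output contract described in the abstract (aligned sequences, one real-valued phenotype per sequence, a per-residue association score) can be illustrated with a minimal sketch. The abstract does not state which statistical test SigniSite uses, so the rank-sum z-score below is an assumed stand-in, and the function name `residue_z_scores` is hypothetical. It is not the SigniSite implementation, and it omits multiple-testing correction and tie correction of the variance.

```python
import math
from collections import defaultdict


def residue_z_scores(sequences, phenotypes):
    """Return {(position, residue): z} for an alignment with real-valued phenotypes.

    Assumption: a rank-sum z-score is used as the association measure.
    Positive z means the residue tends to occur in high-phenotype sequences.
    """
    n = len(sequences)
    if n < 2 or n != len(phenotypes):
        raise ValueError("need at least two sequences, one phenotype value each")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # Assign 1-based ranks by phenotype; tied values share their mean rank.
    order = sorted(range(n), key=lambda i: phenotypes[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and phenotypes[order[j + 1]] == phenotypes[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1

    scores = {}
    for pos in range(length):
        rank_sums = defaultdict(float)
        counts = defaultdict(int)
        for seq, rank in zip(sequences, ranks):
            residue = seq[pos]
            if residue == "-":  # gaps are not scored in this sketch
                continue
            rank_sums[residue] += rank
            counts[residue] += 1
        for residue, m in counts.items():
            if m == n:  # fully conserved column carries no signal
                continue
            expected = m * (n + 1) / 2
            variance = m * (n - m) * (n + 1) / 12  # no tie correction
            scores[(pos, residue)] = (rank_sums[residue] - expected) / math.sqrt(variance)
    return scores


# Toy alignment: the residue at position 1 tracks the phenotype value.
toy_scores = residue_z_scores(["AK", "AK", "AR", "AR"], [1.0, 2.0, 3.0, 4.0])
```

In the toy alignment, position 0 is conserved and receives no score, while `R` at position 1 gets a positive z-score and `K` the matching negative one. A sequence-logo or heat-map display such as the one the abstract describes would be built on top of per-residue scores like these.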
    Original language: English
    Journal: Nucleic Acids Research
    Volume: 41
    Issue number: W1
    Pages (from-to): W286-W291
    ISSN: 0305-1048
    Publication status: Published - 2013
