Fast Arc-Annotated Subsequence Matching in Linear Space

    Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

    Abstract

    An arc-annotated string is a string of characters, called bases, augmented with a set; of pairs, called arcs, each connecting two bases. Given arc-annotated strings P and Q the arc-preserving subsequence problem is to determine if P can be obtained from Q by deleting bases from Q. Whenever a base is deleted any arc with an endpoint in that base is also deleted. Arc-annotated strings where the arcs are "nested" are a natural model of RNA molecules that captures both the primary and secondary structure of these. The arc-preserving subsequence problem for nested arc-annotated strings is basic primitive for investigating the function of RNA molecules. Gramm et al. [ACM Trans. Algorithms 2006] gave an algorithm for this problem using O(nm) time and space, where m and n are the lengths of P and Q, respectively. In this paper we present; a new algorithm using O(nm) time and O(n+m,) space, thereby matching the previous time bound while significantly reducing the space from a quadratic term to linear. This is essential to process large RNA molecules where the space is a likely to be a bottleneck. To obtain our result we introduce several novel ideas which may be of independent interest for related problems on arc-annotated strings.
    Original languageEnglish
    Title of host publicationSOFSEM 2010: Theory and Practice of Computer Science
    PublisherSpringer
    Publication date2010
    Pages188-199
    ISBN (Print)978-3-642-11265-2
    DOIs
    Publication statusPublished - 2010
    EventSOFSEM 2010: Theory and Practice of Computer Science, 36th Conference on Current Trends in Theory and Practice of Computer Science - Spindleruv Mlýn, Czech Republic
    Duration: 23 Jan 201029 Jan 2010
    Conference number: 36

    Conference

    ConferenceSOFSEM 2010: Theory and Practice of Computer Science, 36th Conference on Current Trends in Theory and Practice of Computer Science
    Number36
    CountryCzech Republic
    CitySpindleruv Mlýn
    Period23/01/201029/01/2010
    SeriesLecture Notes in Computer Science
    Volume5901
    ISSN0302-9743

    Fingerprint Dive into the research topics of 'Fast Arc-Annotated Subsequence Matching in Linear Space'. Together they form a unique fingerprint.

    Cite this