String Indexing for Patterns With Wildcards

Philip Bille, Inge Li Gørtz, Hjalte Wedel Vildhøj, Søren Juhl Vind

    Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

    352 Downloads (Pure)

    Abstract

    We consider the problem of indexing a string t of length n to report the occurrences of a query pattern p containing m characters and j wildcards. Let occ be the number of occurrences of p in t, and σ the size of the alphabet. We obtain the following results. - A linear space index with query time O(m + σj log log n + occ). This significantly improves the previously best known linear space index by Lam et al. [ISAAC 2007], which requires query time Θ(jn) in the worst case. - An index with query time O(m + j + occ) using space O(σk2 n log k log n), where k is the maximum number of wildcards allowed in the pattern. This is the first non-trivial bound with this query time. - A time-space trade-off, generalizing the index by Cole et al. [STOC 2004]. Our results are obtained using a novel combination of well-known and new techniques, which could be of independent interest.
    Original languageEnglish
    Title of host publicationAlgorithm Theory – SWAT 2012 : 13th Scandinavian Symposium and Workshops, Helsinki, Finland, July 4-6, 2012. Proceedings
    EditorsFedor V. Fornin, Petteri Kaski
    PublisherSpringer
    Publication date2012
    Pages283-294
    ISBN (Print)978-3-642-31154-3
    ISBN (Electronic)978-3-642-31155-0
    DOIs
    Publication statusPublished - 2012
    Event13th Scandinavian Symposium and Workshops on Algorithm Theory - University of Helsinki, Helsinki, Finland
    Duration: 4 Jul 20126 Jul 2012

    Conference

    Conference13th Scandinavian Symposium and Workshops on Algorithm Theory
    LocationUniversity of Helsinki
    Country/TerritoryFinland
    CityHelsinki
    Period04/07/201206/07/2012
    SeriesLecture Notes in Computer Science
    Volume7357
    ISSN0302-9743

    Fingerprint

    Dive into the research topics of 'String Indexing for Patterns With Wildcards'. Together they form a unique fingerprint.

    Cite this