Compressed Subsequence Matching and Packed Tree Coloring

Philip Bille, Patrick Hagge Cording, Inge Li Gørtz

Research output: Contribution to journal, Journal article, Research, peer-review

Abstract

We present a new algorithm for subsequence matching in grammar-compressed strings. Given a grammar of size n compressing a string of size N and a pattern string of size m over an alphabet of size \(\sigma \), our algorithm uses \(O(n+\frac{n\sigma }{w})\) space and \(O(n+\frac{n\sigma }{w}+m\log N\log w\cdot occ)\) or \(O(n+\frac{n\sigma }{w}\log w+m\log N\cdot occ)\) time. Here w is the word size and occ is the number of minimal occurrences of the pattern. Our algorithm uses less space than previous algorithms and is also faster for \(occ=o(\frac{n}{\log N})\) occurrences. The algorithm uses a new data structure that allows us to efficiently find the next occurrence of a given character after a given position in a compressed string. This data structure is in turn based on a new data structure for the tree color problem, where the node colors are packed in bit strings.
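
To make the notion of a minimal occurrence concrete, here is a minimal Python sketch that works on an uncompressed string. It is not the algorithm from the article: it assumes plain access to the text and uses Python's `str.find` and `str.rfind` as stand-ins for the next-occurrence (and symmetric previous-occurrence) queries that the article answers on the grammar. The function name `minimal_occurrences` is hypothetical.

```python
def minimal_occurrences(s: str, p: str) -> list[tuple[int, int]]:
    """Return inclusive (start, end) index pairs of every minimal occurrence
    of p as a subsequence of s. A window is minimal if p is a subsequence of
    it but of none of its proper sub-windows.

    Illustration on an uncompressed string only (assumption: s is available
    in plain form, unlike the grammar-compressed setting of the article).
    """
    if not p:
        return []
    result = []
    start = 0
    while True:
        # Forward pass: match p greedily left to right with
        # "next occurrence of c at or after position" queries.
        pos = start - 1
        for c in p:
            pos = s.find(c, pos + 1)
            if pos == -1:
                return result  # no further occurrence exists
        end = pos
        # Backward pass: match p right to left from `end` with
        # "previous occurrence of c before position" queries.
        # This cannot fail, because the forward pass found a match.
        pos = end + 1
        for c in reversed(p):
            pos = s.rfind(c, 0, pos)
        begin = pos
        result.append((begin, end))
        # The next minimal occurrence must start strictly after `begin`.
        start = begin + 1
```

Each minimal occurrence costs about 2m character queries in this sketch, which is the same shape as the m · occ factor in the time bounds above; in the article each query is answered on the compressed representation rather than by scanning the text.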

Original language: English
Journal: Algorithmica
Volume: 77
Issue number: 2
Pages (from-to): 336–348
ISSN: 0178-4617
Publication status: Published - 2017

Keywords

  • Straight line program
  • SLP
  • Compressed
  • Subsequence matching
  • Tree coloring
  • First colored ancestor
