Fingerprints in compressed strings

Philip Bille*, Inge Li Gørtz, Patrick Hagge Cording, Benjamin Sach, Hjalte Wedel Vildhøj, Søren Vind

*Corresponding author for this work

Research output: Contribution to journalJournal articleResearchpeer-review

332 Downloads (Pure)

Abstract

In this paper we show how to construct a data structure for a string S of size N compressed into a context-free grammar of size n that supports efficient Karp–Rabin fingerprint queries to any substring of S. That is, given indices i and j, the answer to a query is the fingerprint of the substring S[i,j]. We present the first O(n) space data structures that answer fingerprint queries without decompressing any characters. For Straight Line Programs (SLP) we get O(log⁡N) query time, and for Linear SLPs (an SLP derivative that captures LZ78 compression and its variations) we get O(log⁡log⁡N) query time. We extend the result to solve the longest common extension problem in query time O(log⁡Nlog⁡ℓ) and O(log⁡ℓlog⁡log⁡ℓ+log⁡log⁡N) for SLPs and Linear SLPs, respectively. Here, ℓ denotes the length of the LCE.

Original languageEnglish
JournalJournal of Computer and System Sciences
Volume86
Pages (from-to)171-180
ISSN0022-0000
DOIs
Publication statusPublished - 1 Jun 2017

Keywords

  • Grammar compressed string
  • Karp–Rabin fingerprints
  • Longest common extensions

Fingerprint

Dive into the research topics of 'Fingerprints in compressed strings'. Together they form a unique fingerprint.

Cite this