TY - JOUR

T1 - Fingerprints in compressed strings

AU - Bille, Philip

AU - Gørtz, Inge Li

AU - Cording, Patrick Hagge

AU - Sach, Benjamin

AU - Vildhøj, Hjalte Wedel

AU - Vind, Søren

PY - 2017/6/1

Y1 - 2017/6/1

N2 - In this paper we show how to construct a data structure for a string S of size N compressed into a context-free grammar of size n that supports efficient Karp–Rabin fingerprint queries to any substring of S. That is, given indices i and j, the answer to a query is the fingerprint of the substring S[i,j]. We present the first O(n) space data structures that answer fingerprint queries without decompressing any characters. For Straight Line Programs (SLP) we get O(logN) query time, and for Linear SLPs (an SLP derivative that captures LZ78 compression and its variations) we get O(loglogN) query time. We extend the result to solve the longest common extension problem in query time O(logNlogℓ) and O(logℓloglogℓ+loglogN) for SLPs and Linear SLPs, respectively. Here, ℓ denotes the length of the LCE.

AB - In this paper we show how to construct a data structure for a string S of size N compressed into a context-free grammar of size n that supports efficient Karp–Rabin fingerprint queries to any substring of S. That is, given indices i and j, the answer to a query is the fingerprint of the substring S[i,j]. We present the first O(n) space data structures that answer fingerprint queries without decompressing any characters. For Straight Line Programs (SLP) we get O(logN) query time, and for Linear SLPs (an SLP derivative that captures LZ78 compression and its variations) we get O(loglogN) query time. We extend the result to solve the longest common extension problem in query time O(logNlogℓ) and O(logℓloglogℓ+loglogN) for SLPs and Linear SLPs, respectively. Here, ℓ denotes the length of the LCE.

KW - Grammar compressed string

KW - Karp–Rabin fingerprints

KW - Longest common extensions

U2 - 10.1016/j.jcss.2017.01.002

DO - 10.1016/j.jcss.2017.01.002

M3 - Journal article

AN - SCOPUS:85009813096

SN - 0022-0000

VL - 86

SP - 171

EP - 180

JO - Journal of Computer and System Sciences

JF - Journal of Computer and System Sciences

ER -