Compressed Indexing with Signature Grammars

Anders Roy Christiansen, Mikko Berggren Ettienne

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

111 Downloads (Pure)


The compressed indexing problem is to preprocess a string S of length n into a compressed representation that supports pattern matching queries. That is, given a string P of length m report all occurrences of P in S.We present a data structure that supports pattern matching queries in O(m + occ(lg lg n + lg(epsilon) z)) time using O(z lg(n/z)) space where z is the size of the LZ77 parse of S and epsilon > 0 is an arbitrarily small constant, when the alphabet is small or z = O(n(1-delta)) for any constant delta > 0. We also present two data structures for the general case; one where the space is increased by O(z lg lg z), and one where the query time changes from worst-case to expected. These results improve the previously best known solutions. Notably, this is the first data structure that decides if P occurs in S in O(m) time using O(z lg(n/z)) space. Our results are mainly obtained by a novel combination of a randomized grammar construction algorithm with well known techniques relating pattern matching to 2D-range reporting.
Original languageEnglish
Title of host publicationLATIN 2018: Theoretical Informatics
EditorsMichael A. Bender , Martin Farach-Colton , Miguel A. Mosteiro
Number of pages15
Publication date2018
Publication statusPublished - 2018
Event13th Latin American Theoretical INformatics Symposium - Cultural Center Borges, Buenos Aires, Argentina
Duration: 16 Apr 201819 Apr 2018
Conference number: 13


Conference13th Latin American Theoretical INformatics Symposium
LocationCultural Center Borges
CityBuenos Aires
Internet address
SeriesLecture Notes in Computer Science


Dive into the research topics of 'Compressed Indexing with Signature Grammars'. Together they form a unique fingerprint.

Cite this