From LZ77 to the run-length encoded burrows-wheeler transform, and back

Alberto Policriti, Nicola Prezza

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

Abstract

The Lempel-Ziv factorization (LZ77) and the Run-Length encoded Burrows-Wheeler Transform (RLBWT) are two important tools in text compression and indexing, being their sizes z and r closely related to the amount of text self-repetitiveness. In this paper we consider the problem of converting the two representations into each other within a working space proportional to the input and the output. Let n be the text length. We show that RLBWT can be converted to LZ77 in O(n log r) time and O(r) words of working space. Conversely, we provide an algorithm to convert LZ77 to RLBWT in O(n(log r + log z)) time and O(r + z) words of working space. Note that r and z can be constant if the text is highly repetitive, and our algorithms can operate with (up to) exponentially less space than naive solutions based on full decompression.
Original languageEnglish
Title of host publicationProceedings of the 28th Annual Symposium on Combinatorial Pattern Matching
EditorsK. Juha, Jakub Radozewski, Wojciech Rytter
Number of pages10
Volume78
Place of PublicationDagstuhl, Germany
PublisherSchloß Dagstuhl
Publication date2017
Article number17
DOIs
Publication statusPublished - 2017
Event28th Annual Symposium on Combinatorial Pattern Matching - University of Warsaw Library, Warsaw, Poland
Duration: 4 Jul 20176 Jul 2017

Conference

Conference28th Annual Symposium on Combinatorial Pattern Matching
LocationUniversity of Warsaw Library
Country/TerritoryPoland
CityWarsaw
Period04/07/201706/07/2017
SeriesLeibniz International Proceedings in Informatics
ISSN1868-8969

Keywords

  • Burrows-Wheeler transform
  • Compressed computation
  • Lempel-Ziv
  • Repetitive text collections

Fingerprint

Dive into the research topics of 'From LZ77 to the run-length encoded burrows-wheeler transform, and back'. Together they form a unique fingerprint.

Cite this