Decompressing lempel-ziv compressed text

Philip Bille, Mikko Berggren Ettienne, Travis Gagie, Inge Li Gortz, Nicola Prezza

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

Abstract

We consider the problem of decompressing the Lempel-Ziv 77 representation of a string S of length n using a working space as close as possible to the size z of the input. The folklore solution for the problem runs in O(n) time but requires random access to the whole decompressed text. Another folklore solution is to convert LZ77 into a grammar of size O(z log(n/z)) and then stream S in linear time. In this paper, we show that O(n) time and O(z) working space can be achieved for constant-size alphabets. On general alphabets of size σ, we describe (i) a trade-off achieving O(n log^δ σ) time and O(z log^1-δ σ) space for any 0≤ δ≤ 1, and (ii) a solution achieving O(n) time and O(z log log (n/z)) space. The latter solution, in particular, dominates both folklore algorithms for the problem. Our solutions can, more generally, extract any specified subsequence of S with little overheads on top of the linear running time and working space. As an immediate corollary, we show that our techniques yield improved results for pattern matching problems on LZ77-compressed text.

Original languageEnglish
Title of host publicationProceedings of Data Compression Conference 2020
EditorsAli Bilgin, Michael W. Marcellin, Joan Serra-Sagrista, James A. Storer
PublisherIEEE
Publication dateMar 2020
Pages143-152
Article number9105689
ISBN (Electronic)9781728164571
DOIs
Publication statusPublished - Mar 2020
Event2020 Data Compression Conference, DCC 2020 - Snowbird, United States
Duration: 24 Mar 202027 Mar 2020

Conference

Conference2020 Data Compression Conference, DCC 2020
Country/TerritoryUnited States
CitySnowbird
Period24/03/202027/03/2020
SponsorBrandeis University, Microsoft USA, University of Arizona

Fingerprint

Dive into the research topics of 'Decompressing lempel-ziv compressed text'. Together they form a unique fingerprint.

Cite this