Space-Efficient Re-Pair Compression

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

Abstract

Re-Pair [5] is an effective grammar-based compression scheme achieving strong compression rates in practice. Let n, σ, and d be the text length, alphabet size, and dictionary size of the final grammar, respectively. In their original paper, the authors show how to compute the Re-Pair grammar in expected linear time and 5n + 4σ2 + 4d + √n words of working space on top of the text. In this work, we propose two algorithms improving on the space of their original solution. Our model assumes a memory word of [log2 n] bits and a re-writable input text composed by n such words. Our first algorithm runs in expected O(n/ε) time and uses (1+ε)n+√n words of space on top of the text for any parameter 0 <∊ ≤ 1 chosen in advance. Our second algorithm runs in expected O(n log n) time and improves the space to n + √n words.
Original languageEnglish
Title of host publicationProceedings of 2017 Data Compression Conference
VolumePart F1
PublisherIEEE
Publication date2017
Pages171-80
ISBN (Print)978-1-5090-6721-3
DOIs
Publication statusPublished - 2017
Event2017 Data Compression Conference - Snowbird, United States
Duration: 4 Apr 20177 Apr 2017

Conference

Conference2017 Data Compression Conference
Country/TerritoryUnited States
CitySnowbird
Period04/04/201707/04/2017
SeriesData Compression Conference. Proceedings
ISSN1068-0314

Fingerprint

Dive into the research topics of 'Space-Efficient Re-Pair Compression'. Together they form a unique fingerprint.

Cite this