Algorithms and data structures for grammar-compressed strings

  • Patrick Hagge Cording

Research output: Book/ReportPh.D. thesis

1788 Downloads (Orbit)

Abstract

Textual databases for e.g. biological or web-data are growing rapidly, and it is often only feasible to store the data in compressed form. However, compressing the data comes at a price. Traditional algorithms for e.g. pattern matching requires all data to be decompressed - a computationally demanding task. In this thesis we design data structures for accessing and searching compressed data efficiently.

Our results can be divided into two categories. In the first category we study problems related to pattern matching. In particular, we present new algorithms for counting and comparing substrings, and a new algorithm for finding all occurrences of a pattern in which we may insert gaps. In the other category we deal with accessing and decompressing parts of the compressed string. We show how to quickly access a single character of the compressed string, and present a data structure that supports fast decompression of substrings from prespecified positions.
Original languageEnglish
Place of PublicationKgs. Lyngby
PublisherTechnical University of Denmark
Number of pages66
Publication statusPublished - 2015
SeriesDTU Compute PHD-2014
Number357
ISSN0909-3192

Fingerprint

Dive into the research topics of 'Algorithms and data structures for grammar-compressed strings'. Together they form a unique fingerprint.

Cite this