We consider compact representations of collections of similar strings that support random access queries. The collection of strings is given by a rooted tree where edges are labeled by an edit operation (inserting, deleting, or replacing a character) and a node represents the string obtained by applying the sequence of edit operations on the path from the root to the node. The goal is to compactly represent the entire collection while supporting fast random access to any part of a string in the collection. This problem captures natural scenarios such as representing the past history of an edited document or representing highly-repetitive collections. Given a tree with n nodes, we show how to represent the corresponding collection in O(n) space and optimal O(log n/log log n) query time. This improves the previous time-space trade-offs for the problem. To obtain our results, we introduce new techniques and ideas, including a reduction to a new geometric line segment selection together with an efficient solution.
|Title of host publication
|Proceedings of 31st International Symposium on Algorithms and Computation
|Yixin Cao, Siu-Wing Cheng, Minming Li
|Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
|Published - Dec 2020
|31st International Symposium on Algorithms and Computation - Virtual, Hong Kong, China
Duration: 14 Dec 2020 → 18 Dec 2020
Conference number: 31
|31st International Symposium on Algorithms and Computation
|Virtual, Hong Kong
|14/12/2020 → 18/12/2020
|Leibniz International Proceedings in Informatics, LIPIcs
Bibliographical noteFunding Information:
Supported by the Danish Research Council (DFF ? 4005-00267, DFF ? 1323-00178).
© Philip Bille and Inge Li Gørtz.
- Data compression
- Data structures
- Persistent strings