Abstract
The classical string indexing problem asks to preprocess the input string S for efficient pattern matching queries. Bille, Fischer, Gørtz, Pedersen, and Stordalen [CPM 2023] generalized this to the streaming sliding window string indexing problem, where the input string S arrives as a stream, and we are asked to maintain an index of the last w characters, called the window. Further, at any point in time, a pattern P might appear, again given as a stream, and all occurrences of P in the current window must be output. We require that the time to process each character of the text or the pattern is worst-case. It appears that standard string indexing structures, such as suffix trees, do not provide an efficient solution in such a setting, as to obtain a good worst-case bound, they necessarily need to work right-to-left, and we cannot reverse the pattern while keeping a worst-case guarantee on the time to process each of its characters. Nevertheless, it is possible to obtain a bound of O(log w) (with high probability) by maintaining a hierarchical structure of multiple suffix trees. We significantly improve this upper bound by designing a black-box reduction to maintain a suffix tree under prepending characters to the current text. By plugging in the known results, this allows us to obtain a bound of O(log log w + log log σ) (with high probability), where σ is the size of the alphabet. Further, we introduce an even more general problem, called the streaming dynamic window string indexing, where the goal is to maintain the current text under adding and deleting characters at either end and design a similar black-box reduction.
Original language | English |
---|---|
Journal | Leibniz International Proceedings in Informatics, LIPIcs |
Volume | 296 |
Pages (from-to) | 8:1-8:14 |
ISSN | 1868-8969 |
DOIs | |
Publication status | Published - 2024 |
Event | 35th Annual Symposium on Combinatorial Pattern Matching - Fukuoka, Japan Duration: 25 Jun 2024 → 27 Jun 2024 |
Conference
Conference | 35th Annual Symposium on Combinatorial Pattern Matching |
---|---|
Country/Territory | Japan |
City | Fukuoka |
Period | 25/06/2024 → 27/06/2024 |
Keywords
- Data structures
- Pattern matching
- String indexing