Faster Sliding Window String Indexing in Streams

Research output: Contribution to journalConference articleResearchpeer-review

16 Downloads (Orbit)

Abstract

The classical string indexing problem asks to preprocess the input string S for efficient pattern matching queries. Bille, Fischer, Gørtz, Pedersen, and Stordalen [CPM 2023] generalized this to the streaming sliding window string indexing problem, where the input string S arrives as a stream, and we are asked to maintain an index of the last w characters, called the window. Further, at any point in time, a pattern P might appear, again given as a stream, and all occurrences of P in the current window must be output. We require that the time to process each character of the text or the pattern is worst-case. It appears that standard string indexing structures, such as suffix trees, do not provide an efficient solution in such a setting, as to obtain a good worst-case bound, they necessarily need to work right-to-left, and we cannot reverse the pattern while keeping a worst-case guarantee on the time to process each of its characters. Nevertheless, it is possible to obtain a bound of O(log w) (with high probability) by maintaining a hierarchical structure of multiple suffix trees. We significantly improve this upper bound by designing a black-box reduction to maintain a suffix tree under prepending characters to the current text. By plugging in the known results, this allows us to obtain a bound of O(log log w + log log σ) (with high probability), where σ is the size of the alphabet. Further, we introduce an even more general problem, called the streaming dynamic window string indexing, where the goal is to maintain the current text under adding and deleting characters at either end and design a similar black-box reduction.
Original languageEnglish
JournalLeibniz International Proceedings in Informatics, LIPIcs
Volume296
Pages (from-to)8:1-8:14
ISSN1868-8969
DOIs
Publication statusPublished - 2024
Event35th Annual Symposium on Combinatorial Pattern Matching - Fukuoka, Japan
Duration: 25 Jun 202427 Jun 2024

Conference

Conference35th Annual Symposium on Combinatorial Pattern Matching
Country/TerritoryJapan
CityFukuoka
Period25/06/202427/06/2024

Keywords

  • Data structures
  • Pattern matching
  • String indexing

Fingerprint

Dive into the research topics of 'Faster Sliding Window String Indexing in Streams'. Together they form a unique fingerprint.

Cite this