### Abstract

Original language | English |
---|---|

Title of host publication | Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms |

Place of Publication | San Francisco |

Publisher | Society for Industrial and Applied Mathematics |

Publication date | 2011 |

Pages | 373-389 |

Publication status | Published - 2011 |

Event | Annual ACM-SIAM Symposium on Discrete Algorithms - San Francisco, California, USA Duration: 1 Jan 2011 → … Conference number: 22 |

### Conference

Conference | Annual ACM-SIAM Symposium on Discrete Algorithms |
---|---|

Number | 22 |

City | San Francisco, California, USA |

Period | 01/01/2011 → … |

### Cite this

*Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms*(pp. 373-389). San Francisco: Society for Industrial and Applied Mathematics.

}

*Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms.*Society for Industrial and Applied Mathematics, San Francisco, pp. 373-389, Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, USA, 01/01/2011.

**Random Access to Grammar-Compressed Strings.** / Bille, Philip; Landau, Gad M.; Raman, Rajeev; Sadakane, Kunihiko; Satti, Srinivasa Rao; Weimann, Oren.

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

TY - GEN

T1 - Random Access to Grammar-Compressed Strings

AU - Bille, Philip

AU - Landau, Gad M.

AU - Raman, Rajeev

AU - Sadakane, Kunihiko

AU - Satti, Srinivasa Rao

AU - Weimann, Oren

PY - 2011

Y1 - 2011

N2 - Let S be a string of length N compressed into a context- free grammar S of size n. We present two representa- tions of S achieving O(logN) random access time, and either O(n-ak(n)) construction time and space on the pointer machine model, or O(n) construction time and space on the RAM. Here, k(n) is the inverse of the kth row of Ackermann's function. Our representations also efficiently support decompression of any substring in S: we can decompress any substring of length m in the same complexity as a single random access query and ad- ditional O(m) time. Combining these results with fast algorithms for uncompressed approximate string match- ing leads to several efficient algorithms for approximate string matching on grammar-compressed strings without decompression. For instance, we can nd all approximate occurrences of a pattern P with at most k errors in time O(n(minfjPjk; k4 +jPjg+logN)+occ), where occ is the number of occurrences of P in S. Finally, we are able to generalize our results to navigation and other operations on grammar-compressed trees. All of the above bounds signicantly improve the cur- rently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data struc- ture, two "biased" weighted ancestor data structures, and a compact representation of heavy-paths in grammars.

AB - Let S be a string of length N compressed into a context- free grammar S of size n. We present two representa- tions of S achieving O(logN) random access time, and either O(n-ak(n)) construction time and space on the pointer machine model, or O(n) construction time and space on the RAM. Here, k(n) is the inverse of the kth row of Ackermann's function. Our representations also efficiently support decompression of any substring in S: we can decompress any substring of length m in the same complexity as a single random access query and ad- ditional O(m) time. Combining these results with fast algorithms for uncompressed approximate string match- ing leads to several efficient algorithms for approximate string matching on grammar-compressed strings without decompression. For instance, we can nd all approximate occurrences of a pattern P with at most k errors in time O(n(minfjPjk; k4 +jPjg+logN)+occ), where occ is the number of occurrences of P in S. Finally, we are able to generalize our results to navigation and other operations on grammar-compressed trees. All of the above bounds signicantly improve the cur- rently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data struc- ture, two "biased" weighted ancestor data structures, and a compact representation of heavy-paths in grammars.

M3 - Article in proceedings

SP - 373

EP - 389

BT - Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms

PB - Society for Industrial and Applied Mathematics

CY - San Francisco

ER -