String attractors

Verification and optimization

Dominik Kempa, Alberto Policriti, Nicola Prezza, Eva Rotenberg

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

69 Downloads (Pure)

Abstract

String attractors [STOC 2018] are combinatorial objects recently introduced to unify all known dictionary compression techniques in a single theory. A set γ ⊆ [1.n] is a k-attractor for a string S ∈ Σn if and only if every distinct substring of S of length at most k has an occurrence crossing at least one of the positions in γ. Finding the smallest k-attractor is NP-hard for k ≥ 3, but polylogarithmic approximations can be found using reductions from dictionary compressors. It is easy to reduce the k-attractor problem to a set-cover instance where the string's positions are interpreted as sets of substrings. The main result of this paper is a much more powerful reduction based on the truncated suffix tree. Our new characterization of the problem leads to more efficient algorithms for string attractors: we show how to check the validity and minimality of a k-attractor in near-optimal time and how to quickly compute exact solutions. For example, we prove that a minimum 3-attractor can be found in O(n) time when |Σ| ∈ O(3+ϵ√log n) for some constant ϵ > 0, despite the problem being NP-hard for large Σ.

Original languageEnglish
Title of host publicationProceedings of 26th European Symposium on Algorithms
Number of pages13
Volume112
PublisherSchloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
Publication date1 Aug 2018
ISBN (Print)9783959770811
DOIs
Publication statusPublished - 1 Aug 2018
Event26th Annual European Symposium on Algorithms - Helsinki, Finland
Duration: 20 Aug 201822 Aug 2018

Conference

Conference26th Annual European Symposium on Algorithms
CountryFinland
CityHelsinki
Period20/08/201822/08/2018

Keywords

  • Dictionary compression
  • Set cover
  • String attractors

Cite this

Kempa, D., Policriti, A., Prezza, N., & Rotenberg, E. (2018). String attractors: Verification and optimization. In Proceedings of 26th European Symposium on Algorithms (Vol. 112). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.ESA.2018.52
Kempa, Dominik ; Policriti, Alberto ; Prezza, Nicola ; Rotenberg, Eva. / String attractors : Verification and optimization. Proceedings of 26th European Symposium on Algorithms. Vol. 112 Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 2018.
@inproceedings{fe814d21eb3f4f3180c1196e380cc0c7,
title = "String attractors: Verification and optimization",
abstract = "String attractors [STOC 2018] are combinatorial objects recently introduced to unify all known dictionary compression techniques in a single theory. A set γ ⊆ [1.n] is a k-attractor for a string S ∈ Σn if and only if every distinct substring of S of length at most k has an occurrence crossing at least one of the positions in γ. Finding the smallest k-attractor is NP-hard for k ≥ 3, but polylogarithmic approximations can be found using reductions from dictionary compressors. It is easy to reduce the k-attractor problem to a set-cover instance where the string's positions are interpreted as sets of substrings. The main result of this paper is a much more powerful reduction based on the truncated suffix tree. Our new characterization of the problem leads to more efficient algorithms for string attractors: we show how to check the validity and minimality of a k-attractor in near-optimal time and how to quickly compute exact solutions. For example, we prove that a minimum 3-attractor can be found in O(n) time when |Σ| ∈ O(3+ϵ√log n) for some constant ϵ > 0, despite the problem being NP-hard for large Σ.",
keywords = "Dictionary compression, Set cover, String attractors",
author = "Dominik Kempa and Alberto Policriti and Nicola Prezza and Eva Rotenberg",
year = "2018",
month = "8",
day = "1",
doi = "10.4230/LIPIcs.ESA.2018.52",
language = "English",
isbn = "9783959770811",
volume = "112",
booktitle = "Proceedings of 26th European Symposium on Algorithms",
publisher = "Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing",

}

Kempa, D, Policriti, A, Prezza, N & Rotenberg, E 2018, String attractors: Verification and optimization. in Proceedings of 26th European Symposium on Algorithms. vol. 112, Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 26th Annual European Symposium on Algorithms, Helsinki, Finland, 20/08/2018. https://doi.org/10.4230/LIPIcs.ESA.2018.52

String attractors : Verification and optimization. / Kempa, Dominik; Policriti, Alberto; Prezza, Nicola; Rotenberg, Eva.

Proceedings of 26th European Symposium on Algorithms. Vol. 112 Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 2018.

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

TY - GEN

T1 - String attractors

T2 - Verification and optimization

AU - Kempa, Dominik

AU - Policriti, Alberto

AU - Prezza, Nicola

AU - Rotenberg, Eva

PY - 2018/8/1

Y1 - 2018/8/1

N2 - String attractors [STOC 2018] are combinatorial objects recently introduced to unify all known dictionary compression techniques in a single theory. A set γ ⊆ [1.n] is a k-attractor for a string S ∈ Σn if and only if every distinct substring of S of length at most k has an occurrence crossing at least one of the positions in γ. Finding the smallest k-attractor is NP-hard for k ≥ 3, but polylogarithmic approximations can be found using reductions from dictionary compressors. It is easy to reduce the k-attractor problem to a set-cover instance where the string's positions are interpreted as sets of substrings. The main result of this paper is a much more powerful reduction based on the truncated suffix tree. Our new characterization of the problem leads to more efficient algorithms for string attractors: we show how to check the validity and minimality of a k-attractor in near-optimal time and how to quickly compute exact solutions. For example, we prove that a minimum 3-attractor can be found in O(n) time when |Σ| ∈ O(3+ϵ√log n) for some constant ϵ > 0, despite the problem being NP-hard for large Σ.

AB - String attractors [STOC 2018] are combinatorial objects recently introduced to unify all known dictionary compression techniques in a single theory. A set γ ⊆ [1.n] is a k-attractor for a string S ∈ Σn if and only if every distinct substring of S of length at most k has an occurrence crossing at least one of the positions in γ. Finding the smallest k-attractor is NP-hard for k ≥ 3, but polylogarithmic approximations can be found using reductions from dictionary compressors. It is easy to reduce the k-attractor problem to a set-cover instance where the string's positions are interpreted as sets of substrings. The main result of this paper is a much more powerful reduction based on the truncated suffix tree. Our new characterization of the problem leads to more efficient algorithms for string attractors: we show how to check the validity and minimality of a k-attractor in near-optimal time and how to quickly compute exact solutions. For example, we prove that a minimum 3-attractor can be found in O(n) time when |Σ| ∈ O(3+ϵ√log n) for some constant ϵ > 0, despite the problem being NP-hard for large Σ.

KW - Dictionary compression

KW - Set cover

KW - String attractors

U2 - 10.4230/LIPIcs.ESA.2018.52

DO - 10.4230/LIPIcs.ESA.2018.52

M3 - Article in proceedings

SN - 9783959770811

VL - 112

BT - Proceedings of 26th European Symposium on Algorithms

PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing

ER -

Kempa D, Policriti A, Prezza N, Rotenberg E. String attractors: Verification and optimization. In Proceedings of 26th European Symposium on Algorithms. Vol. 112. Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. 2018 https://doi.org/10.4230/LIPIcs.ESA.2018.52