Fast Practical Compression of Deterministic Finite Automata

Philip Bille*, Inge Li Gørtz, Max Rishøj Pedersen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

Abstract

We revisit the popular delayed deterministic finite automaton (D2FA) compression algorithm introduced by Kumar et al. [SIGCOMM 2006] for compressing deterministic finite automata (DFAs) used in intrusion detection systems. This compression scheme exploits similarities in the outgoing sets of transitions among states to achieve strong compression while maintaining high throughput for matching.
Unfortunately, the D2FA algorithm and later variants of it require at least quadratic compression time since they compare all pairs of states to compute an optimal compression. This is too slow and, in some cases, even infeasible for collections of regular expression in modern intrusion detection systems that produce DFAs of millions of states.
Our main result is a simple, general framework for constructing D2FA based on locality-sensitive hashing that constructs an approximation of the optimal D2FA in near-linear time. We apply our approach to the original D2FA compression algorithm and two important variants, and we experimentally evaluate our algorithms on DFAs from widely used modern intrusion detection systems. Overall, our new algorithms compress up to an order of magnitude faster than existing solutions with either no or little loss of compression size. Consequently, our algorithms are significantly more scalable and can handle larger collections of regular expressions than previous solutions.
Original languageEnglish
Title of host publication50th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2025
Volume15538
PublisherSpringer
Publication date2025
Pages136-150
ISBN (Electronic)978-3-031-82670-2
DOIs
Publication statusPublished - 2025
Event50th International Conference on Current Trends in Theory and Practice of Computer Science - Bratislava, Slovakia
Duration: 20 Jan 202523 Jan 2025

Conference

Conference50th International Conference on Current Trends in Theory and Practice of Computer Science
Country/TerritorySlovakia
CityBratislava
Period20/01/202523/01/2025
SeriesLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN0302-9743

Keywords

  • Algorithms
  • Data Compression
  • Finite Automata

Fingerprint

Dive into the research topics of 'Fast Practical Compression of Deterministic Finite Automata'. Together they form a unique fingerprint.

Cite this