Vojtěch Havlena

LO
3papers
2citations
Novelty52%
AI Score44

3 Papers

LOApr 15
Towards Efficient Matching of Regexes with Backreferences using Register Set Automata (Technical Report)

Vojtěch Havlena, Lukáš Holík, Ondřej Lengál et al.

Matching regexes (regular expressions) is a common problem in many areas of computer science, with requirements on high speed and robust performance. Regexes with backreferences allow one to express certain patterns (even beyond regular) concisely, however, since the matching is usually done by backtracking, the matching speed can degrade to a degree that constitutes a service failure or a security threat. To facilitate high-speed matching of such regexes, we propose register set automata (RSAs), an extension of register automata where registers can contain sets of symbols (from a potentially infinite alphabet) and the following operations are supported: adding input values to registers, merging or clearing registers, and testing whether a register contains a value. We show that a large class of register automata can be transformed into deterministic RSAs, which can serve as a basis for fast matching of a family of regexes with single-letter capture groups and backreferences. We also give a derivative-based algorithm for transforming a large class of regexes with backreferences to register automata and show that the time complexity of matching is linear and quadratic to the length of the input for finite and infinite alphabets respectively. Our prototype implementation of a regex matcher shows that our approach can significantly improve the robustness of state-of-the-art regex matchers on regexes with backreferences. We also study the theoretical properties of the model and show that the emptiness problem for RSAs is decidable and complete for the $\mathbf{F}_ω$ class and that RSAs are incomparable in expressive power to other popular automata models over data words.

FLMay 14
String Solving with Stabilization and Transducers (Technical Report)

David Chocholatý, Vojtěch Havlena, Lukáš Holík et al.

We generalize an efficient automata-based approach to string constraint solving, the stabilization-based method behind the solver Z3-Noodler, to support relational constraints represented by finite-state transducers (useful, for example, for modeling replaceAll constraints). We focus on an efficient treatment of length constraints by reducing the need for expensive concatenation elimination, which is a major bottleneck in automata-based string solving. We also propose powerful heuristics that significantly improve performance in practice. Implemented on top of Z3-Noodler, our method vastly outperforms existing solvers on benchmarks with relational constraints. It solves more instances and runs orders of magnitude faster.

LOMay 14
Kofola 1.0: A Modular Approach to ω-Regular Complementation and Inclusion Checking (Technical Report)

Ondrej Alexaj, Vojtěch Havlena, Lukáš Holík et al.

We present Kofola, an efficient tool for complementation and inclusion checking of Büchi automata, two central tasks in automata-theoretic verification with applications in model checking, monitoring, and theorem proving. Kofola implements a state-of-the-art modular complementation framework that decomposes the input automaton into strongly connected components and applies to each component a complementation algorithm tailored to its structural properties. Building on this modular construction, Kofola also provides modular inclusion checking with new heuristics. A key ingredient is a new on-the-fly emptiness-checking algorithm for the simple generalized Rabin pair condition produced by our complementation, allowing the search to terminate as soon as the explored state space suffices. Empirical evaluation shows that Kofola is highly competitive with state-of-the-art complementation and inclusion-checking tools: it is the most robust tool in our evaluation and often outperforms competitors by several orders of magnitude on benchmarks from practical applications.