FL IRMar 14, 2019

Regular Expressions with Backreferences: Polynomial-Time Matching Techniques

arXiv:1903.05896v22 citations

Originality Incremental advance

AI Analysis

This addresses the computational inefficiency in regex matching for developers and theorists, though it is incremental as it focuses on specific bounded cases rather than solving the general NP-complete problem.

The paper tackled the NP-complete matching problem for regular expressions with backreferences by introducing a complexity parameter (active variable degree) that enables polynomial-time matching when bounded, and defined memory-deterministic regex for efficient matching in O(|w|p(|r|)) time.

Regular expressions with backreferences (regex, for short), as supported by most modern libraries for regular expression matching, have an NP-complete matching problem. We define a complexity parameter of regex, called active variable degree, such that regex with this parameter bounded by a constant can be matched in polynomial-time. Moreover, we formulate a novel type of determinism for regex (on an automaton-theoretic level), which yields the class of memory-deterministic regex that can be matched in time O(|w|p(|r|)) for a polynomial p (where r is the regex and w the word). Natural extensions of these concepts lead to properties of regex that are intractable to check.

View on arXiv PDF

Similar