FLCLSep 14, 2022

On the Intersection of Context-Free and Regular Languages

CambridgeMicrosoft
arXiv:2209.06809v2269 citationsh-index: 54
AI Analysis

This addresses a technical limitation in formal language theory for researchers and practitioners, but it is incremental as it extends an existing method.

The paper tackles the problem that the classic Bar-Hillel construction cannot handle finite-state automata with ε-arcs when intersecting context-free and regular languages, and presents a generalized construction that works with ε-arcs while maintaining asymptotic size.

The Bar-Hillel construction is a classic result in formal language theory. It shows, by a simple construction, that the intersection of a context-free language and a regular language is itself context-free. In the construction, the regular language is specified by a finite-state automaton. However, neither the original construction (Bar-Hillel et al., 1961) nor its weighted extension (Nederhof and Satta, 2003) can handle finite-state automata with $\varepsilon$-arcs. While it is possible to remove $\varepsilon$-arcs from a finite-state automaton efficiently without modifying the language, such an operation modifies the automaton's set of paths. We give a construction that generalizes the Bar-Hillel in the case where the desired automaton has $\varepsilon$-arcs, and further prove that our generalized construction leads to a grammar that encodes the structure of both the input automaton and grammar while retaining the asymptotic size of the original construction.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes