FL, AI, LO, LO | May 15, 2024

$O_2$ is a multiple context-free grammar: an implementation-, formalisation-friendly proof

arXiv:2405.09396v1 | h-index: 9 | DLT
Originality: Incremental advance
AI Analysis

This work addresses a foundational problem in computational linguistics and theory of computation by improving proof methods for language classification, though it is incremental as it focuses on a specific case ($n \leq 2$) within a broader open problem.

The paper tackles the problem of generating $n$-balanced languages with multiple context-free grammars (MCFGs) by analyzing existing proofs for their suitability in creating verified parsing algorithms, concluding they are inadequate and providing a new, elementary proof for the case $n \leq 2$ to advance towards a verified algorithm for $O_2$.

Classifying formal languages according to the expressiveness of grammars able to generate them is a fundamental problem in computational linguistics and, therefore, in the theory of computation. Furthermore, this kind of analysis can give insight into the classification of abstract algebraic structures such as groups, for example through the correspondence given by the word problem. While many such classification problems remain open, others have been settled. Recently, it was proved that $n$-balanced languages (i.e., languages whose strings contain the same number of occurrences of the letters $a_i$ and $A_i$ for each $1 \leq i \leq n$) can be generated by multiple context-free grammars (MCFGs), which are one of the several slight extensions of context-free grammars added to the classical Chomsky hierarchy to make this classification more precise. This paper analyses the existing proofs from the computational and the proof-theoretical points of view, systematically studying whether each proof can lead to a verified (i.e., checked by a proof assistant) algorithm parsing balanced languages via MCFGs. We conclude that none of the existing proofs is realistically suitable for this practical goal, and proceed to provide a radically new, elementary, extremely short proof for the crucial case $n \leq 2$. Finally, a comparative analysis with respect to the existing proofs justifies why the proposed proof is a substantial step towards concretely obtaining a verified parsing algorithm for $O_2$.
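
To make the definition of an $n$-balanced language concrete, here is a minimal sketch of a membership test for $O_n$. It is not taken from the paper: the function name `is_balanced` and the encoding of $a_1, a_2, \dots$ as `a`, `b`, ... and of $A_1, A_2, \dots$ as `A`, `B`, ... are assumptions made purely for illustration.

```python
from collections import Counter
import string


def is_balanced(word: str, n: int = 2) -> bool:
    """Check membership in O_n under an assumed encoding.

    Assumption (not from the paper): a_i is the i-th lowercase ASCII
    letter and A_i is the matching uppercase letter, so n must be <= 26.
    """
    lower = string.ascii_lowercase[:n]
    allowed = set(lower) | set(lower.upper())
    # Reject words that use letters outside the 2n-letter alphabet.
    if not set(word) <= allowed:
        return False
    counts = Counter(word)
    # Balanced: each a_i occurs exactly as often as its partner A_i.
    return all(counts[c] == counts[c.upper()] for c in lower)


print(is_balanced("abAB"))  # balanced in both letter pairs
print(is_balanced("abA"))   # one unmatched 'b'
```

Deciding membership by counting is easy. The question the abstract addresses is different: whether an MCFG generates exactly this set of strings, and whether a proof of that fact can be turned into a verified parser.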

