FL · LG · May 12

Finite Sentence-Interface Control for Learning Bounded-Fan-Out Linear MCFGs under Fixed Monoid Typing

arXiv:2605.11644
Predicted impact: top 35% in FL · last 90 days
Originality: Incremental advance
AI Analysis

For computational linguists and grammar induction researchers, this work provides a theoretical foundation for learning a restricted class of mildly context-sensitive grammars from positive examples, though it is an incremental extension of existing distributional learning techniques.

This paper extends distributional learning from context-free grammars to bounded-fan-out linear multiple context-free grammars (MCFGs) by introducing sentence-interface types as finite control objects. The authors prove that, for a fixed fan-out bound and a fixed explicit finite monoid homomorphism, the resulting class is identifiable in the limit from positive data, and that the hypothesis for any finite sample can be constructed in polynomial time.

We study positive-data learning of bounded-fan-out linear multiple context-free grammars under a fixed explicit finite monoid homomorphism \(h\). The main obstacle beyond the context-free case is that an MCFG nonterminal derives a tuple whose components may be placed in a surrounding sentence in different orders. We introduce sentence-interface types as finite external control objects for such tuple occurrences. A type records the permutation of tuple components in the final sentence together with the \(h\)-values of the boundary intervals between them. For reduced working binary linear nondeleting MCFG presentations whose string languages satisfy \((f,h)\)-tuple substitutability, we build a typed refinement, a finite characteristic sample, and a canonical positive-data learner. Once the sample contains this characteristic sample and remains contained in the target language, the learner reconstructs the language exactly. Consequently, for fixed fan-out bound \(f\) and fixed explicit \(h\), the resulting class is identifiable in the limit from positive data. Moreover, the hypothesis associated with any given finite sample is constructible in polynomial time for fixed \(f\) and fixed \(h\), including output size. Thus sentence-interface control is the finite mechanism that lifts fixed-\(h\) distributional reconstruction from context-free grammars to bounded-fan-out linear MCFGs.
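The description of a sentence-interface type above can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the paper's construction: the function names `h_value` and `interface_type`, the half-open span encoding, the choice to include the leftmost and rightmost outer intervals among the boundary intervals, and the example homomorphism (parity of the number of `a`s, valued in the two-element additive monoid) are all hypothetical choices made here. Given a sentence and the positions of the components of one tuple occurrence, it returns the order in which the components appear together with the \(h\)-values of the intervals around them.

```python
from typing import Callable, Dict, Hashable, Sequence, Tuple

Span = Tuple[int, int]  # half-open interval [start, end) in the sentence


def h_value(word: str,
            letter_value: Dict[str, Hashable],
            op: Callable[[Hashable, Hashable], Hashable],
            identity: Hashable) -> Hashable:
    """Image of `word` under the homomorphism induced by `letter_value`."""
    acc = identity
    for ch in word:
        acc = op(acc, letter_value[ch])
    return acc


def interface_type(sentence: str,
                   spans: Sequence[Span],
                   letter_value: Dict[str, Hashable],
                   op: Callable[[Hashable, Hashable], Hashable],
                   identity: Hashable):
    """Return (order, gaps) for one tuple occurrence (assumed encoding).

    spans[i] is the position of the i-th tuple component.
    order lists component indices from left to right in the sentence.
    gaps holds the h-values of the intervals before, between, and after
    the components (len(spans) + 1 values in total).
    """
    order = sorted(range(len(spans)), key=lambda i: spans[i][0])
    gaps = []
    prev_end = 0
    for i in order:
        start, end = spans[i]
        if start < prev_end or end < start or end > len(sentence):
            raise ValueError("spans must be disjoint and lie inside the sentence")
        gaps.append(h_value(sentence[prev_end:start], letter_value, op, identity))
        prev_end = end
    gaps.append(h_value(sentence[prev_end:], letter_value, op, identity))
    return tuple(order), tuple(gaps)


# Hypothetical example: h counts the letter 'a' modulo 2.
parity_letters = {"a": 1, "b": 0}
add_mod_2 = lambda x, y: (x + y) % 2

# Sentence "abaabba"; component 0 is "bb" at [4, 6), component 1 is "b" at [1, 2),
# so the second component occurs to the left of the first.
example_type = interface_type("abaabba", [(4, 6), (1, 2)],
                              parity_letters, add_mod_2, 0)
print(example_type)
```

In this example the two components occur in swapped order, which is exactly the situation that does not arise for a context-free nonterminal. Because the permutation ranges over a bounded number of components and the gap values range over a finite monoid, the set of possible types is finite for fixed \(f\) and fixed \(h\), which is what allows types to serve as finite control objects.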

