Takayuki Kuriyama

2 Papers

82.5 FL May 8
Distributional Learning of Context-Free Languages under Fixed Finite-Monoid Typing

Takayuki Kuriyama

We study distributional learning of context-free languages under a fixed recognizable congruence $\sim_h$ given as the kernel of an explicit finite monoid homomorphism $h:Σ^*\to M$. For this fixed-$h$ setting, we develop a finite typed reconstruction theory for context-free $\sim_h$-substitutable languages. Starting from a reduced context-free grammar, we introduce a typed refinement that records both yield types and outer context types, show that the relevant structure is concentrated in a finite typed reconstruction basis, and prove that this basis is exposed by a finite observation set. Occurrences of the same nonterminal symbol may therefore have to be separated when their outer $h$-contexts differ. We then prove exact reconstruction from positive data. From any finite sample $K\subseteq Σ^*$, we construct a canonical hypothesis grammar $\hat G(K)$, and we show that once $K$ contains the finite observation set associated with the target typed grammar, $\hat G(K)$ generates the target language exactly. Consequently, for every explicit finite monoid homomorphism $h$, the class $\mathcal C_h^{\mathrm{cf}}$ of context-free $\sim_h$-substitutable languages is identifiable in the limit from positive data, with polynomial-time hypothesis construction and update. For the linear subclass $\mathcal C_h^{\mathrm{lin}}$, we further prove polynomial upper bounds on characteristic-sample size and word length. Thus the same learner gives a full polynomial time-and-data result for the linear subclass.
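To make the fixed congruence concrete, here is a minimal Python sketch of a finite monoid homomorphism $h:Σ^*\to M$ and the induced relation $u \sim_h v$ iff $h(u) = h(v)$. The monoid (integers mod 2 under addition), the alphabet `{a, b}`, and the names `mul`, `LETTER_IMAGE`, `h`, and `congruent` are illustrative assumptions, not taken from the paper.

```python
from functools import reduce

# Hypothetical finite monoid M: integers mod 2 under addition, identity 0.
IDENTITY = 0


def mul(x, y):
    """Monoid operation of the assumed example monoid M."""
    return (x + y) % 2


# Hypothetical images of the letters; h is determined by these values.
LETTER_IMAGE = {"a": 1, "b": 0}


def h(word):
    """Extend the letter images to a homomorphism from words to M."""
    return reduce(mul, (LETTER_IMAGE[c] for c in word), IDENTITY)


def congruent(u, v):
    """u ~_h v holds exactly when u and v have the same image under h."""
    return h(u) == h(v)
```

With these assumed values, `h` tracks the parity of the number of `a`s, so `congruent("ab", "ba")` holds while `congruent("a", "aa")` does not. Any other explicit finite monoid could be substituted by changing `mul`, `IDENTITY`, and `LETTER_IMAGE`.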

56.1 FL May 12
Finite Sentence-Interface Control for Learning Bounded-Fan-Out Linear MCFGs under Fixed Monoid Typing

Takayuki Kuriyama

We study positive-data learning of bounded-fan-out linear multiple context-free grammars under a fixed explicit finite monoid homomorphism \(h\). The main obstacle beyond the context-free case is that an MCFG nonterminal derives a tuple whose components may be placed in a surrounding sentence in different orders. We introduce sentence-interface types as finite external control objects for such tuple occurrences. A type records the permutation of tuple components in the final sentence together with the \(h\)-values of the boundary intervals between them. For reduced working binary linear nondeleting MCFG presentations whose string languages satisfy \((f,h)\)-tuple substitutability, we build a typed refinement, a finite characteristic sample, and a canonical positive-data learner. Once the sample contains this characteristic sample and remains contained in the target language, the learner reconstructs the language exactly. Consequently, for fixed fan-out bound \(f\) and fixed explicit \(h\), the resulting class is identifiable in the limit from positive data. Moreover, the hypothesis associated with any given finite sample is constructible in polynomial time for fixed \(f\) and fixed \(h\), including output size. Thus sentence-interface control is the finite mechanism that lifts fixed-\(h\) distributional reconstruction from context-free grammars to bounded-fan-out linear MCFGs.
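The description of a sentence-interface type can be illustrated with a small Python sketch. It computes, for one tuple occurrence inside a sentence, the order in which the tuple components appear and the \(h\)-values of the intervals around them. This is only one plausible reading of the abstract: the example monoid (integers mod 2), the half-open span encoding, the inclusion of the outer prefix and suffix among the boundary intervals, and the name `interface_type` are all assumptions, and the paper's formal definition may differ.

```python
from functools import reduce

# Hypothetical finite monoid: integers mod 2 under addition, identity 0.
IDENTITY = 0
LETTER_IMAGE = {"a": 1, "b": 0}  # assumed letter images


def mul(x, y):
    return (x + y) % 2


def h(word):
    """Homomorphic image of a word in the assumed monoid."""
    return reduce(mul, (LETTER_IMAGE[c] for c in word), IDENTITY)


def interface_type(sentence, spans):
    """Return (permutation, boundary h-values) for one tuple occurrence.

    spans[i] is the half-open (start, end) position of the i-th tuple
    component inside the sentence (an assumed encoding).
    """
    # Component indices listed in left-to-right order of appearance.
    order = sorted(range(len(spans)), key=lambda i: spans[i][0])
    boundary_values = []
    cursor = 0
    for i in order:
        start, end = spans[i]
        if start < cursor or end < start:
            raise ValueError("component spans must be disjoint and well formed")
        # h-value of the interval preceding this component.
        boundary_values.append(h(sentence[cursor:start]))
        cursor = end
    # h-value of the interval after the last component.
    boundary_values.append(h(sentence[cursor:]))
    return tuple(order), tuple(boundary_values)
```

For the sentence `"abaab"` with spans `[(3, 4), (0, 1)]`, component 1 appears before component 0, so the sketch returns the permutation `(1, 0)` together with the boundary values `(0, 1, 0)`. Because both the set of permutations of at most \(f\) components and the monoid are finite, only finitely many such types exist for fixed \(f\) and \(h\).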