CL | Apr 13, 2024

Labeled Morphological Segmentation with Semi-Markov Models

ETH Zurich
arXiv:2404.08997v11100 citations | h-index: 70 | CoNLL
Originality: Incremental advance
AI Analysis

This work addresses morphological processing tasks for computational linguistics, offering incremental improvements in segmentation, stemming, and tag classification.

The paper tackles morphological segmentation by introducing a unified framework and a new hierarchy of morphotactic tagsets, reporting absolute F1 improvements of 2-6 points over the baseline on segmentation across six languages.

We present labeled morphological segmentation, an alternative view of morphological processing that unifies several tasks. From an annotation standpoint, we additionally introduce a new hierarchy of morphotactic tagsets. Finally, we develop Chipmunk, a discriminative morphological segmentation system that, contrary to previous work, explicitly models morphotactics. We show that Chipmunk yields improved performance on three tasks for all six languages: (i) morphological segmentation, (ii) stemming and (iii) morphological tag classification. On morphological segmentation, our method shows absolute improvements of 2-6 points F1 over the baseline.
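
To make the idea of labeled segmentation with a semi-Markov model concrete, here is a minimal sketch of generic semi-Markov Viterbi decoding over (morph, tag) segments. It is not the paper's implementation: the tag names, the toy lexicon, the transition penalties, and the functions `toy_score` and `semi_markov_viterbi` are all hypothetical, chosen only to show how a score over whole segments and tag-to-tag transitions (a stand-in for morphotactics) selects one labeled segmentation.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Segment = Tuple[str, str]  # (morph, tag)
START = "<s>"              # hypothetical start-of-word tag

# Illustrative coarse tags and lexicon; not the paper's tagsets or features.
TAGS = ("PREFIX", "ROOT", "SUFFIX")
LEXICON = {("un", "PREFIX"): 2.0, ("lock", "ROOT"): 3.0, ("able", "SUFFIX"): 2.0}
# Toy stand-in for morphotactics: tag transitions that are penalised.
BAD_TRANSITIONS = {(START, "SUFFIX"), ("ROOT", "PREFIX"),
                   ("SUFFIX", "PREFIX"), ("SUFFIX", "ROOT")}


def toy_score(prev_tag: str, tag: str, morph: str) -> float:
    """Score one segment: lexicon bonus (or length penalty) plus transition penalty."""
    emission = LEXICON.get((morph, tag), -float(len(morph)))
    transition = -5.0 if (prev_tag, tag) in BAD_TRANSITIONS else 0.0
    return emission + transition


def semi_markov_viterbi(
    word: str,
    tags: Sequence[str],
    score: Callable[[str, str, str], float],
    max_len: Optional[int] = None,
) -> List[Segment]:
    """Return the highest-scoring labeled segmentation of `word`."""
    n = len(word)
    max_len = max_len or n
    neg_inf = float("-inf")
    # best[i][t]: best score of a segmentation of word[:i] whose last segment has tag t.
    best = [{t: neg_inf for t in tags} for _ in range(n + 1)]
    back = [{t: None for t in tags} for _ in range(n + 1)]

    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            morph = word[j:i]
            for t in tags:
                if j == 0:
                    # First segment of the word: transition from START.
                    candidates = [(score(START, t, morph), None)]
                else:
                    candidates = [(best[j][p] + score(p, t, morph), p)
                                  for p in tags if best[j][p] > neg_inf]
                for value, prev in candidates:
                    if value > best[i][t]:
                        best[i][t] = value
                        back[i][t] = (j, prev)

    if n == 0:
        return []
    last = max(tags, key=lambda t: best[n][t])
    if best[n][last] == neg_inf:
        raise ValueError("no finite-scoring segmentation found")

    # Follow back-pointers from the end of the word to the start.
    segments: List[Segment] = []
    i, t = n, last
    while i > 0:
        j, prev = back[i][t]
        segments.append((word[j:i], t))
        i, t = j, prev
    segments.reverse()
    return segments


if __name__ == "__main__":
    print(semi_markov_viterbi("unlockable", TAGS, toy_score))
```

Under these toy scores, `unlockable` decodes to `un/PREFIX lock/ROOT able/SUFFIX`. Dropping the tags gives a plain segmentation, and reading off the `ROOT` segment gives a crude root, which is one way to picture how a single labeled output can serve several tasks.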
