Morphological Segmentation Inside-Out
This work addresses the need for hierarchical morphological analysis in natural language processing, particularly for derivational morphology. It is incremental: it builds on existing segmentation methods by adding a context-free model.
The paper introduces a discriminative, joint model of morphological segmentation that captures hierarchical structure and orthographic changes, and releases an annotated treebank of 7454 English words to support future research.
Morphological segmentation has traditionally been modeled with non-hierarchical models, which yield flat segmentations as output. In many cases, however, proper morphological analysis requires hierarchical structure -- especially in the case of derivational morphology. In this work, we introduce a discriminative, joint model of morphological segmentation along with the orthographic changes that occur during word formation. To the best of our knowledge, this is the first attempt to approach discriminative segmentation with a context-free model. Additionally, we release an annotated treebank of 7454 English words with constituency parses, encouraging future research in this area.
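To make the contrast between flat and hierarchical output concrete, the following minimal sketch represents a morphological analysis as a binary tree over morphemes. The word "unlockable", the nested-tuple encoding, and the helper names `leaves` and `bracket` are illustrative assumptions, not the paper's model or treebank format. The sketch shows that two different constituency structures collapse to the same flat segmentation, which is the information a non-hierarchical model cannot express.

```python
from typing import List, Tuple, Union

# Hypothetical encoding: a tree is a morpheme (str) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def leaves(tree: Tree) -> List[str]:
    """Return the flat segmentation: the morphemes read left to right."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def bracket(tree: Tree) -> str:
    """Render the tree as a bracketed string."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    return "[" + bracket(left) + " " + bracket(right) + "]"


# Two hierarchical analyses of "unlockable" (illustrative example).
reading_a: Tree = ("un", ("lock", "able"))   # "not able to be locked"
reading_b: Tree = (("un", "lock"), "able")   # "able to be unlocked"

for reading in (reading_a, reading_b):
    print(bracket(reading), "->", leaves(reading))
```

Both readings yield the flat segmentation un + lock + able, yet they differ in which morphemes combine first. Modeling orthographic changes (for example, a stem-final letter dropped before a suffix) would require an additional mapping from surface forms to underlying morphemes, which this sketch deliberately leaves out.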