Exploring Linguistic Probes for Morphological Generalization
This work addresses the need for better evaluation of morphological generalization in computational linguistics. The contribution is incremental: it supplements existing data-splitting methods with new probes rather than replacing them.
The paper tackles the problem of understanding how morphological inflection systems generalize across languages by introducing language-specific probes. It finds evidence that three leading systems use distinct generalization strategies over conjugational classes and feature sets, on both orthographic and phonologically transcribed inputs, in English, Spanish, and Swahili.
Modern work on the cross-linguistic computational modeling of morphological inflection has typically employed language-independent data splitting algorithms. In this paper, we supplement that approach with language-specific probes designed to test aspects of morphological generalization. Testing these probes on three morphologically distinct languages, English, Spanish, and Swahili, we find evidence that three leading morphological inflection systems employ distinct generalization strategies over conjugational classes and feature sets on both orthographic and phonologically transcribed inputs.
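To make the contrast between the two evaluation styles concrete, the following is a minimal sketch, not the paper's actual probes. It assumes a toy set of Spanish (lemma, tag, form) triples with UniMorph-style tags, and the helper names `conjugation_class`, `random_split`, and `class_probe_split` are hypothetical. The random split ignores linguistic structure, while the probe split withholds an entire conjugation class so that test items require generalizing to an unseen class.

```python
import random

# Toy (lemma, feature tag, inflected form) triples; illustrative data only.
TRIPLES = [
    ("hablar", "V;IND;PRS;1;SG", "hablo"),
    ("hablar", "V;IND;PRS;3;SG", "habla"),
    ("cantar", "V;IND;PRS;1;SG", "canto"),
    ("cantar", "V;IND;PRS;3;SG", "canta"),
    ("comer", "V;IND;PRS;1;SG", "como"),
    ("comer", "V;IND;PRS;3;SG", "come"),
    ("beber", "V;IND;PRS;1;SG", "bebo"),
    ("beber", "V;IND;PRS;3;SG", "bebe"),
    ("vivir", "V;IND;PRS;1;SG", "vivo"),
    ("vivir", "V;IND;PRS;3;SG", "vive"),
]


def conjugation_class(lemma):
    """Hypothetical helper: read the Spanish class off the infinitive ending."""
    return lemma[-2:]


def random_split(triples, test_fraction=0.2, seed=0):
    """Language-independent baseline: shuffle and cut, ignoring structure."""
    shuffled = triples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def class_probe_split(triples, held_out_class):
    """Language-specific probe (assumed design): test on one unseen class."""
    train = [t for t in triples if conjugation_class(t[0]) != held_out_class]
    test = [t for t in triples if conjugation_class(t[0]) == held_out_class]
    return train, test


# Hold out all -ir verbs; the training set contains only -ar and -er verbs.
probe_train, probe_test = class_probe_split(TRIPLES, "ir")
base_train, base_test = random_split(TRIPLES)
```

Under the random split, a test lemma's class is almost always represented in training, so high accuracy says little about class-level generalization; the probe split removes that overlap by construction.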