On the Complexity and Typology of Inflectional Morphological Systems
This research gives linguists and computational linguists a quantitative framework for measuring the complexity of inflectional morphology and comparing it across typologically different languages.
The study quantified the complexity of morphological systems across languages and found an empirical trade-off where languages have either large inflectional paradigms or high irregularity, but not both.
We quantify the linguistic complexity of different languages' morphological systems. We verify that there is an empirical trade-off between paradigm size and irregularity: a language's inflectional paradigms may be either large or highly irregular, but never both. Our methodology measures paradigm irregularity as the entropy of the surface realization of a paradigm, that is, how hard it is to jointly predict all the surface forms of a paradigm. We estimate this entropy with a variational approximation. Our measurements are taken on large morphological paradigms from 31 typologically diverse languages.
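To make the entropy-based measure concrete, the following is a minimal sketch of why a model-based (variational) estimate is usable: the cross-entropy of any approximating distribution q against the true distribution p is never smaller than the entropy of p, so a fitted model yields an upper bound on irregularity. The distributions, the realization labels, and the function names here are illustrative assumptions, not data or code from the study, and the toy uses a single categorical choice rather than the joint distribution over all forms of a paradigm.

```python
import math

def entropy(p):
    """Shannon entropy H(p) in bits of a discrete distribution given as a dict."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)

def cross_entropy(p, q):
    """Cross-entropy H(p, q) in bits; q must give nonzero mass wherever p does."""
    return -sum(px * math.log2(q[x]) for x, px in p.items() if px > 0)

# Hypothetical "true" distribution over how a past-tense form is realized.
# These numbers are invented for illustration only.
p = {"regular-suffix": 0.90, "stem-change": 0.08, "suppletive": 0.02}

# Hypothetical approximating model q (for example, a learned predictor).
q = {"regular-suffix": 0.80, "stem-change": 0.15, "suppletive": 0.05}

h_true = entropy(p)            # the quantity we would like to know
h_bound = cross_entropy(p, q)  # what the model lets us compute

print(f"H(p)    = {h_true:.4f} bits")
print(f"H(p, q) = {h_bound:.4f} bits (upper bound)")
```

The gap between the two values is the KL divergence from q to p, so a better model gives a tighter bound. In practice p is unknown, and the cross-entropy would be estimated by averaging the model's negative log-probability over held-out paradigms.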