CL · Feb 15

Character-aware Transformers Learn an Irregular Morphological Pattern Yet None Generalize Like Humans

Akhilesh Kakolu Ramarao, Kevin Tang, Dinah Baer-Henney

arXiv:2602.14100v1 · 1.62 citations · h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses cognitive modeling in linguistics, showing that current models fail to replicate human morphological abstraction. The advance is incremental: it highlights specific gaps without proposing a new solution.

The study tests whether neural networks can model human-like morphological learning by training encoder-decoder transformers on the Spanish L-shaped morphome. Position-invariant models captured the pattern better than sequential ones, but no model generalized it to novel forms the way humans do: humans preferentially extended it to the first-person singular indicative.

Whether neural networks can serve as cognitive models of morphological learning remains an open question. Recent work has shown that encoder-decoder models can acquire irregular patterns, but evidence that they generalize these patterns like humans is mixed. We investigate this using the Spanish L-shaped morphome, where only the first-person singular indicative (e.g., pongo 'I put') shares its stem with all subjunctive forms (e.g., ponga, pongas) despite lacking apparent phonological, semantic, or syntactic motivation. We compare five encoder-decoder transformers varying along two dimensions: sequential vs. position-invariant positional encoding, and atomic vs. decomposed tag representations. Positional encoding proves decisive: position-invariant models recover the correct L-shaped paradigm clustering even when L-shaped verbs are scarce in training, whereas sequential positional encoding models only partially capture the pattern. Yet none of the models productively generalize this pattern to novel forms. Position-invariant models generalize the L-shaped stem across subjunctive cells but fail to extend it to the first-person singular indicative, producing a mood-based generalization rather than the L-shaped morphomic pattern. Humans do the opposite, generalizing preferentially to the first-person singular indicative over subjunctive forms. None of the models reproduce the human pattern, highlighting the gap between statistical pattern reproduction and morphological abstraction.
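
The L-shaped distribution described in the abstract can be made concrete with a small sketch. The Python below is an illustration only, not the authors' code or data format: it lists the present-tense forms of poner and checks that the stem pong- occurs in exactly the first-person singular indicative plus every subjunctive cell. The cell labels ("ind", "sbj", "1sg", ...) and the function names are assumptions introduced for this example.

```python
# Illustrative sketch: the paradigm layout and helper names are assumptions,
# not taken from the paper.

# Present-tense paradigm of Spanish "poner" ('to put').
# Keys are (mood, person/number) cells; values are the inflected forms.
PONER = {
    ("ind", "1sg"): "pongo",
    ("ind", "2sg"): "pones",
    ("ind", "3sg"): "pone",
    ("ind", "1pl"): "ponemos",
    ("ind", "2pl"): "ponéis",
    ("ind", "3pl"): "ponen",
    ("sbj", "1sg"): "ponga",
    ("sbj", "2sg"): "pongas",
    ("sbj", "3sg"): "ponga",
    ("sbj", "1pl"): "pongamos",
    ("sbj", "2pl"): "pongáis",
    ("sbj", "3pl"): "pongan",
}


def cells_with_stem(paradigm, stem):
    """Return the set of paradigm cells whose form begins with `stem`."""
    return {cell for cell, form in paradigm.items() if form.startswith(stem)}


def is_l_shaped(paradigm, stem):
    """True if `stem` occurs in exactly the 1sg indicative and all subjunctive cells."""
    expected = {
        cell for cell in paradigm
        if cell == ("ind", "1sg") or cell[0] == "sbj"
    }
    return cells_with_stem(paradigm, stem) == expected


if __name__ == "__main__":
    print(is_l_shaped(PONER, "pong"))  # the velar stem follows the L-shape
    print(is_l_shaped(PONER, "pon"))   # the bare stem occurs in every cell
```

In these terms, the mood-based generalization attributed to the models would correspond to a stem appearing in all "sbj" cells but not in ("ind", "1sg"), which `is_l_shaped` would reject.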
