CL · Jun 2, 2021

Lower Perplexity is Not Always Human-Like

arXiv:2106.01229v2723 citations
AI Analysis

This work highlights a critical gap in computational psycholinguistics by showing that findings from English may not generalize to other languages, emphasizing the need for cross-lingual evaluation to build truly human-like models.

The paper tests whether a generalization established in English, that language models with lower perplexity behave more like human readers, also holds in Japanese. It finds that it does not, which suggests the generalization is not universal across languages.

In computational psycholinguistics, various language models have been evaluated against human reading behavior (e.g., eye movement) to build human-like computational models. However, most previous efforts have focused almost exclusively on English, despite the recent trend towards linguistic universals within the general community. To fill this gap, this paper investigates whether the established results in computational psycholinguistics can be generalized across languages. Specifically, we re-examine an established generalization -- the lower perplexity a language model has, the more human-like the language model is -- in Japanese, a language with typologically different structures from English. Our experiments demonstrate that this established generalization exhibits a surprising lack of universality; namely, lower perplexity is not always human-like. Moreover, this discrepancy between English and Japanese is further explored from the perspective of (non-)uniform information density. Overall, our results suggest that a cross-lingual evaluation will be necessary to construct human-like computational models.
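For readers unfamiliar with the quantity at the center of the abstract, the following is a minimal sketch of the standard textbook definitions of per-token surprisal and perplexity. It is not the authors' code. The function names and the two probability lists are hypothetical, chosen only for illustration.

```python
import math

def surprisals(token_probs):
    """Per-token surprisal in bits: -log2 p(token | context)."""
    return [-math.log2(p) for p in token_probs]

def perplexity(token_probs):
    """Perplexity is 2 raised to the mean per-token surprisal (in bits)."""
    s = surprisals(token_probs)
    return 2 ** (sum(s) / len(s))

# Hypothetical conditional probabilities that two models assign
# to the same four tokens (illustrative numbers, not real data).
model_a = [0.5, 0.25, 0.5, 0.25]
model_b = [0.25, 0.25, 0.25, 0.25]

print(perplexity(model_a))  # lower perplexity: better next-token prediction
print(perplexity(model_b))
```

Lower perplexity only says that a model predicts the next token better on average. Whether its per-token surprisal also tracks human reading behavior is a separate empirical question, which is what the abstract reports can differ between English and Japanese.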

Code Implementations: 1 repo