CL · Sep 21, 2025

Modeling Bottom-up Information Quality during Language Processing

arXiv:2509.17047v2 · 1 citation · h-index: 11 · EMNLP
Originality: Synthesis-oriented
AI Analysis

This work tests a specific prediction of psycholinguistic models of language processing. It is incremental: it operationalizes an existing theoretical concept and tests it with new data.

The study asks how the quality of bottom-up information affects language processing. It proposes an information-theoretic measure of that quality, based on mutual information, and tests it with reading experiments in English and Chinese. Reducing visual information (for example, by occluding the top or bottom half of a word) increases reading times, and the upper half of a word carries more information than the lower half, with a greater asymmetry in English.

Contemporary theories model language processing as integrating both top-down expectations and bottom-up inputs. One major prediction of such models is that the quality of the bottom-up inputs modulates ease of processing -- noisy inputs should lead to difficult and effortful comprehension. We test this prediction in the domain of reading. First, we propose an information-theoretic operationalization for the "quality" of bottom-up information as the mutual information (MI) between visual information and word identity. We formalize this prediction in a mathematical model of reading as a Bayesian update. Second, we test our operationalization by comparing participants' reading times in conditions where words' information quality has been reduced, either by occluding their top or bottom half, with full words. We collect data in English and Chinese. We then use multimodal language models to estimate the mutual information between visual inputs and words. We use these data to estimate the specific effect of reduced information quality on reading times. Finally, we compare how information is distributed across visual forms. In English and Chinese, the upper half contains more information about word identity than the lower half. However, the asymmetry is more pronounced in English, a pattern which is reflected in the reading times.
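The two ingredients named in the abstract, mutual information between visual input and word identity and reading as a Bayesian update, can be illustrated with a small toy sketch. Everything below is an assumption made for illustration: the two-word vocabulary, the likelihood tables `clear` and `occluded`, and the helper names `mutual_information` and `posterior` are hypothetical and do not come from the paper, which estimates MI with multimodal language models rather than hand-written tables.

```python
import numpy as np


def mutual_information(joint):
    """Mutual information in bits for a joint table P(word, visual input).

    Rows index words, columns index visual observations.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()              # normalise to a proper distribution
    p_w = joint.sum(axis=1, keepdims=True)   # marginal over words
    p_v = joint.sum(axis=0, keepdims=True)   # marginal over visual inputs
    mask = joint > 0                         # skip zero cells (0 * log 0 = 0)
    ratio = joint[mask] / (p_w @ p_v)[mask]
    return float(np.sum(joint[mask] * np.log2(ratio)))


def posterior(prior, likelihood):
    """Bayesian update: P(word | visual input) from P(word) and P(input | word)."""
    unnormalised = prior * likelihood
    return unnormalised / unnormalised.sum()


# Toy setup (hypothetical numbers): two candidate words, equally likely a priori.
prior = np.array([0.5, 0.5])

# P(visual input | word); rows are words, columns are visual observations.
clear = np.array([[0.95, 0.05],
                  [0.05, 0.95]])     # full word: input is highly diagnostic
occluded = np.array([[0.60, 0.40],
                     [0.40, 0.60]])  # half-occluded word: input is noisy

mi_clear = mutual_information(prior[:, None] * clear)
mi_occluded = mutual_information(prior[:, None] * occluded)

# Posterior over words after seeing the first visual observation.
post_clear = posterior(prior, clear[:, 0])
post_occluded = posterior(prior, occluded[:, 0])

print(f"MI clear:    {mi_clear:.3f} bits")
print(f"MI occluded: {mi_occluded:.3f} bits")
print("Posterior (clear):   ", post_clear)
print("Posterior (occluded):", post_occluded)
```

In this toy setting the occluded table yields lower mutual information and a flatter posterior, which is the sense in which lower-quality bottom-up input leaves more uncertainty about word identity. How that uncertainty maps onto reading times is an empirical question the paper addresses and is not modelled here.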

