CL · Aug 5, 2020

Multiple Texts as a Limiting Factor in Online Learning: Quantifying (Dis-)similarities of Knowledge Networks across Languages

arXiv:2008.02047v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This research addresses linguistic bias in online learning resources, a problem relevant to educators, learners, and researchers. Its contribution is incremental in that it builds on existing computational models.

The study investigated whether the information one obtains from Wikipedia on a given topic depends on the language consulted, testing this hypothesis across 25 subject areas and 35 languages. It found a language-related bias in Wikipedia: the extent of information available on a topic varies significantly by language.

We test the hypothesis that the extent to which one obtains information on a given topic through Wikipedia depends on the language in which it is consulted. Controlling for the size factor, we investigate this hypothesis for 25 subject areas. Since Wikipedia is a central part of the web-based information landscape, this indicates a language-related linguistic bias. The article therefore deals with the question of whether or not Wikipedia exhibits this kind of linguistic relativity. From the perspective of educational science, the article develops a computational model of the information landscape from which multiple texts are drawn as typical input of web-based reading. For this purpose, it develops a hybrid model of intra- and intertextual similarity of different parts of the information landscape and tests this model using the example of 35 languages and their corresponding Wikipedias. In this way, the article builds a bridge between reading research, educational science, Wikipedia research, and computational linguistics.
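To make the idea of comparing coverage across language editions concrete, here is a minimal sketch. It is not the paper's hybrid model: it assumes each language edition's treatment of a subject area has already been reduced to a set of language-independent concept IDs, and it swaps in two standard set measures (Jaccard similarity and the overlap coefficient). The function names, the `toy_networks` data, and the concept IDs are all hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Shared items relative to the union of both sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def overlap_coefficient(a: set, b: set) -> float:
    """Shared items relative to the smaller set; a crude way to discount size."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def pairwise_similarity(networks: dict) -> dict:
    """Score every pair of language editions for one subject area."""
    result = {}
    for (lang_a, items_a), (lang_b, items_b) in combinations(sorted(networks.items()), 2):
        result[(lang_a, lang_b)] = {
            "jaccard": jaccard(items_a, items_b),
            "overlap": overlap_coefficient(items_a, items_b),
        }
    return result


# Hypothetical toy data: concept IDs covered by three editions on one topic.
toy_networks = {
    "de": {"c1", "c2", "c3", "c4"},
    "en": {"c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"},
    "pt": {"c1", "c9"},
}

scores = pairwise_similarity(toy_networks)
for pair, values in scores.items():
    print(pair, values)
```

In this toy data, the German set is fully contained in the larger English set, so the overlap coefficient is 1.0 while Jaccard is only 0.5. The gap between the two measures illustrates why size needs to be controlled before differences between editions are read as differences in content.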

Code Implementations: 1 repo
