CL · HC · Apr 2, 2019

The Tower of Babel Meets Web 2.0: User-Generated Content and its Applications in a Multilingual Context

arXiv:1904.01689v1 · 199 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses language barriers in knowledge representation, a problem relevant to developers and researchers building culturally-aware or hyperlingual applications. Its contribution is incremental, since it builds on existing studies of Wikipedia diversity.

The study examined the fragmenting effect of language on user-generated content by analyzing knowledge diversity across 25 Wikipedia language editions, finding greater diversity than previously assumed and showing its significant impact on applications using Wikipedia as a knowledge source.

This study explores language's fragmenting effect on user-generated content by examining the diversity of knowledge representations across 25 different Wikipedia language editions. This diversity is measured at two levels: the concepts that are included in each edition and the ways in which these concepts are described. We demonstrate that the diversity present is greater than has been presumed in the literature and has a significant influence on applications that use Wikipedia as a source of world knowledge. We close by explicating how knowledge diversity can be beneficially leveraged to create "culturally-aware applications" and "hyperlingual applications".

