ML | CL | CY | SOC-PH | AP | Nov 16, 2015

Learning about Spanish dialects through Twitter

arXiv:1511.04970v2 | 1 citation
Originality: Synthesis-oriented
AI Analysis

This work helps linguists and computational researchers understand Spanish dialect variation, but its contribution is incremental: it applies existing methods to new social media data.

The paper maps Spanish language variation by analyzing geographically tagged Twitter messages. The resulting maps show linguistic variation on an unprecedented global scale and indicate that urban varieties have an international character, while rural areas show more regional uniformity.

This paper maps the large-scale variation of the Spanish language by employing a corpus based on geographically tagged Twitter messages. Lexical dialects are extracted from an analysis of variants of tens of concepts. The resulting maps show linguistic variation on an unprecedented scale across the globe. We discuss the properties of the main dialects within a machine learning approach and find that varieties spoken in urban areas have an international character in contrast to country areas where dialects show a more regional uniformity.
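As a rough illustration of what "variants of tens of concepts" in geotagged messages could look like computationally, the sketch below bins tweets into a latitude/longitude grid and computes, per cell, the relative frequency of each lexical variant within its concept. It is not the authors' pipeline: the concept table `CONCEPTS`, the 0.25-degree cell size, and the function names `cell_of`, `variant_counts`, and `feature_vector` are all assumptions made for this example.

```python
from collections import Counter, defaultdict
import math
import re

# Hypothetical concept -> variants table (illustrative, not the paper's list).
CONCEPTS = {
    "car": ["coche", "carro", "auto"],
    "computer": ["ordenador", "computadora", "computador"],
}


def cell_of(lat, lon, size=0.25):
    """Map a coordinate to a grid cell index (cell size in degrees is assumed)."""
    return (math.floor(lat / size), math.floor(lon / size))


def variant_counts(tweets, size=0.25):
    """Count variant occurrences per grid cell.

    tweets: iterable of (lat, lon, text) tuples.
    Returns {cell: Counter({variant: count})}.
    """
    variants = {v for vs in CONCEPTS.values() for v in vs}
    counts = defaultdict(Counter)
    for lat, lon, text in tweets:
        cell = cell_of(lat, lon, size)
        for token in re.findall(r"\w+", text.lower()):
            if token in variants:
                counts[cell][token] += 1
    return counts


def feature_vector(counter):
    """Relative frequency of each variant within its concept, in a fixed order."""
    vec = []
    for concept in sorted(CONCEPTS):
        vs = CONCEPTS[concept]
        total = sum(counter[v] for v in vs)
        vec.extend(counter[v] / total if total else 0.0 for v in vs)
    return vec
```

Each cell's vector could then be passed to any standard clustering method to group cells into candidate dialect regions; which method the paper actually uses is not stated in this abstract beyond "a machine learning approach".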

