CLIR · Apr 28, 2024

Mapping 'when'-clauses in Latin American and Caribbean languages: an experiment in subtoken-based typology

arXiv:2404.18257v1 · 27 citations · h-index: 1 · AMERICASNLP
Originality: Incremental advance
AI Analysis

The approach enables strategy-agnostic typological analysis of temporal subordination, addressing a gap for linguists studying languages that mark the relation morphologically.

This paper studies morphological variation in temporal subordination ('when'-clauses) in Latin American and Caribbean languages, where such marking is common but poorly understood. It presents probabilistic semantic maps, computed from these languages, that capture both lexical and morphological devices.

Languages can encode temporal subordination lexically, via subordinating conjunctions, and morphologically, by marking the relation on the predicate. Systematic cross-linguistic variation among the former can be studied using well-established token-based typological approaches to token-aligned parallel corpora. Variation among morphological means is, by contrast, much harder to tackle and therefore more poorly understood, even though such means are predominant in several language groups. This paper explores variation in the expression of generic temporal subordination ('when'-clauses) among the languages of Latin America and the Caribbean, where morphological marking is particularly common. It presents probabilistic semantic maps computed on the basis of the languages of the region, thus avoiding bias towards the many languages of the world that exclusively use lexified connectors. The maps incorporate associations between character $n$-grams and English 'when'. The approach captures morphological clause-linkage devices in addition to lexified connectors, paving the way for larger-scale, strategy-agnostic analyses of typological variation in temporal subordination.
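The idea of associating character $n$-grams with English 'when' can be illustrated with a minimal sketch. The abstract does not say which association measure the paper uses, so pointwise mutual information (PMI) is an assumption here, as are the function names `char_ngrams` and `when_associations`. The sentence pairs are invented strings in a made-up language whose suffix `-ki` stands in for a morphological 'when' marker; they are not data from any real language.

```python
from collections import Counter
from math import log2


def char_ngrams(text, n_min=2, n_max=4):
    """Return the set of character n-grams in the words of `text`.

    Each word is wrapped in '<' and '>' so that prefixes and suffixes
    are distinguishable from word-internal strings.
    """
    grams = set()
    for word in text.lower().split():
        padded = f"<{word}>"
        for n in range(n_min, n_max + 1):
            for i in range(len(padded) - n + 1):
                grams.add(padded[i:i + n])
    return grams


def when_associations(pairs, n_min=2, n_max=4):
    """Score each target-side n-gram by its PMI with English 'when'.

    `pairs` is a list of (english_sentence, target_sentence) tuples.
    PMI is an assumed measure, not necessarily the paper's own.
    Only n-grams that co-occur with 'when' at least once are scored.
    """
    total = len(pairs)
    when_count = 0
    gram_count = Counter()   # sentences containing the n-gram
    joint_count = Counter()  # sentences containing the n-gram and 'when'
    for english, target in pairs:
        has_when = "when" in english.lower().split()
        when_count += has_when
        grams = char_ngrams(target, n_min, n_max)
        gram_count.update(grams)
        if has_when:
            joint_count.update(grams)

    scores = {}
    for gram, joint in joint_count.items():
        p_joint = joint / total
        p_gram = gram_count[gram] / total
        p_when = when_count / total
        scores[gram] = log2(p_joint / (p_gram * p_when))
    return scores


# Invented toy data: the suffix "-ki" plays the role of a 'when' marker.
pairs = [
    ("when he came we ate", "tamuki sano"),
    ("when she slept it rained", "lopeki miru"),
    ("he came and we ate", "tamu sano"),
    ("it rained all day", "miru dako"),
]

scores = when_associations(pairs)
for gram in ("ki>", "<ta", "mir"):
    print(gram, round(scores[gram], 3))
```

On this toy input the word-final n-gram `ki>` gets the highest possible score, while stems that appear both with and without 'when' (`<ta`, `mir`) score zero. Because the scoring works on substrings, not aligned word tokens, a suffix and a free-standing conjunction are treated alike, which is the sense in which such an analysis is strategy-agnostic.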
