CL · Mar 31, 2023

Trimming Phonetic Alignments Improves the Inference of Sound Correspondence Patterns from Multilingual Wordlists

arXiv:2303.17932v1263 citations · h-index: 31
Originality: Incremental advance
AI Analysis

This work targets the tedious manual annotation that researchers in historical linguistics face, but it is an incremental advance because it builds on existing alignment methods.

The paper aims to improve the phonetic alignments used to infer sound correspondence patterns from multilingual wordlists, proposing a trimming workflow inspired by evolutionary biology. The best trimming technique substantially improves alignment consistency, increasing both the proportion of frequent correspondence patterns and the proportion of words with regular cognate relations.

Sound correspondence patterns form the basis of cognate detection and phonological reconstruction in historical language comparison. Methods for the automatic inference of correspondence patterns from phonetically aligned cognate sets have been proposed, but their application to multilingual wordlists requires extremely well-annotated datasets. Since annotation is tedious and time-consuming, it would be desirable to find ways to improve aligned cognate data automatically. Taking inspiration from trimming techniques in evolutionary biology, which improve alignments by excluding problematic sites, we propose a workflow that trims phonetic alignments in comparative linguistics prior to the inference of correspondence patterns. Testing these techniques on a large standardized collection of ten datasets with expert annotations from different language families, we find that the best trimming technique substantially improves the overall consistency of the alignments. The results show a clear increase in the proportion of frequent correspondence patterns and words exhibiting regular cognate relations.
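The idea of trimming an alignment by excluding problematic sites can be illustrated with a minimal sketch. It assumes a generic gap-threshold rule of the kind used for biological sequence alignments: a column is dropped when the share of gap symbols in it exceeds a cutoff. The function names, the `-` gap symbol, the 0.5 threshold, and the toy alignment are all hypothetical and chosen for illustration; the abstract does not specify which trimming criteria the paper actually evaluates.

```python
from typing import List

GAP = "-"  # assumed gap symbol; real datasets may use a different convention


def gap_proportions(alignment: List[List[str]]) -> List[float]:
    """Return the share of gap symbols in each column of the alignment."""
    n_rows = len(alignment)
    n_cols = len(alignment[0])
    return [
        sum(1 for row in alignment if row[col] == GAP) / n_rows
        for col in range(n_cols)
    ]


def trim_alignment(
    alignment: List[List[str]], threshold: float = 0.5
) -> List[List[str]]:
    """Drop every column whose gap proportion exceeds `threshold`.

    This is a generic gap-based rule, not necessarily the paper's technique.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows of an alignment must have the same length")
    props = gap_proportions(alignment)
    keep = [i for i, p in enumerate(props) if p <= threshold]
    return [[row[i] for i in keep] for row in alignment]


# Invented toy alignment: four cognate words, one row per language.
# The last column is a gap in three of the four rows.
alignment = [
    ["h", "a", "n", "d", "-"],
    ["h", "a", "n", "t", "-"],
    ["h", "a", "n", "d", "-"],
    ["h", "o", "n", "d", "a"],
]

trimmed = trim_alignment(alignment, threshold=0.5)
for row in trimmed:
    print(" ".join(row))
```

In this toy case the final column has a gap proportion of 0.75 and is removed, leaving four columns from which correspondence patterns could then be inferred. A stricter or looser threshold trades retained data against alignment consistency.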

Code Implementations: 1 repo