CL, LG (May 27, 2020)

In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology

arXiv:2005.13575v1997 citations
AI Analysis

This work, aimed at linguists and computational researchers, addresses the challenge of modeling historical sound change in the Slavic languages. It is an incremental advance in applying neural methods to historical phonology.

The paper tests whether neural networks can learn diachronic phonological generalizations in a multilingual setting. It finds that the Straight-Through model is the most accurate of the three models compared, while the Sigmoid model's language embeddings agree most closely with the traditional subgrouping of the Slavic languages.

This paper investigates the ability of neural network architectures to effectively learn diachronic phonological generalizations in a multilingual setting. We employ models using three different types of language embedding (dense, sigmoid, and straight-through). We find that the Straight-Through model outperforms the other two in terms of accuracy, but the Sigmoid model's language embeddings show the strongest agreement with the traditional subgrouping of the Slavic languages. We find that the Straight-Through model has learned coherent, semi-interpretable information about sound change, and outline directions for future research.
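The three embedding types named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulation, so everything below is an assumption: the function names are hypothetical, the 0.5 threshold is a common convention, and the backward rule shown is one widely used variant of the straight-through estimator, not necessarily the one the authors chose.

```python
import math

def dense_embedding(params):
    # Dense: the learned real-valued vector is used as-is (continuous, unbounded).
    return list(params)

def sigmoid_embedding(params):
    # Sigmoid: each dimension is squashed into (0, 1), giving a continuous
    # value that can be read as a soft binary feature.
    return [1.0 / (1.0 + math.exp(-x)) for x in params]

def straight_through_forward(params):
    # Straight-through, forward pass: threshold the sigmoid output at 0.5
    # (assumed threshold) so each dimension is a hard 0/1 feature.
    return [1.0 if p > 0.5 else 0.0 for p in sigmoid_embedding(params)]

def straight_through_backward(params, upstream_grad):
    # Straight-through, backward pass (assumed variant): the non-differentiable
    # threshold is treated as the identity, so the gradient flows through the
    # sigmoid only: dL/dx = dL/dy * p * (1 - p).
    probs = sigmoid_embedding(params)
    return [g * p * (1.0 - p) for g, p in zip(upstream_grad, probs)]

# Hypothetical 3-dimensional parameter vector for one language.
params = [-2.0, 0.3, 1.5]
dense = dense_embedding(params)
soft = sigmoid_embedding(params)
hard = straight_through_forward(params)
grad = straight_through_backward(params, [1.0, 1.0, 1.0])
```

In this sketch the dense and sigmoid embeddings are continuous, while the straight-through embedding is discrete in the forward pass yet still trainable by gradient descent, which is the continuous versus discrete contrast the title refers to.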
