CL | LG | Apr 24, 2024

Neural Proto-Language Reconstruction

CMU
arXiv:2404.15690v1 | 3 citations | h-index: 18
Originality: Incremental advance
AI Analysis

This work targets proto-language reconstruction, a task linguists otherwise carry out painstakingly by hand. The advance is incremental: it builds on existing RNN and Transformer models.

The paper automates proto-form reconstruction by improving existing computational models. A Transformer with an added VAE structure performs better on the WikiHan dataset, and data augmentation stabilizes training.

Proto-form reconstruction has been a painstaking process for linguists. Recently, computational models such as RNNs and Transformers have been proposed to automate this process. We take three different approaches to improve upon previous methods: data augmentation to recover missing reflexes, adding a VAE structure to the Transformer model for proto-to-language prediction, and using a neural machine translation model for the reconstruction task. We find that with the additional VAE structure, the Transformer model performs better on the WikiHan dataset, and that the data augmentation step stabilizes training.
