Approaching Reflex Predictions as a Classification Problem Using Extended Phonological Alignments
This work addresses a specific challenge in computational linguistics; its contribution is incremental, since it builds on existing methods.
The paper frames the prediction of cognate reflexes as a classification task over extended phonological alignments and evaluates a random forest model on a shared task.
This work describes an implementation of the "extended alignment" (or "multitiers") approach for cognate reflex prediction, submitted to the "Prediction of Cognate Reflexes" shared task. As in List2022d, the technique automatically extends sequence alignments with multilayered vectors that encode informational tiers covering both site-specific traits, such as sound classes and distinctive features, and contextual and suprasegmental ones, conveyed by cross-site referrals and replication. The method makes it possible to treat cognate reflex prediction as a general classification problem, with models trained on a parallel corpus of cognate sets. A model using random forests is trained and evaluated on the shared task for reflex prediction, and the experimental results are presented and discussed along with some differences from other implementations.
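The idea of extending an alignment with tiers and then classifying each site can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the language names `A`, `B`, `C`, the toy cognate sets, the tiny `SOUND_CLASS` map, and the functions `extend_alignment`, `train_lookup`, and `predict` are hypothetical and do not come from the submitted system. A simple frequency lookup stands in for the random forest, so that the example runs with the standard library only.

```python
from collections import Counter, defaultdict

# Hypothetical, heavily simplified sound-class map (illustration only).
SOUND_CLASS = {"p": "P", "b": "P", "f": "P", "t": "T", "d": "T",
               "a": "V", "e": "V", "i": "V", "o": "V", "u": "V", "-": "-"}


def extend_alignment(alignment, target):
    """Turn {language: [segments]} into one (features, label) row per site.

    Each row carries site-specific tiers (segment, sound class) and
    contextual tiers (left and right neighbour) for every source language.
    The label is the target-language segment, or None if it is unknown.
    """
    sources = sorted(lang for lang in alignment if lang != target)
    length = len(alignment[sources[0]])
    rows = []
    for i in range(length):
        features = {}
        for lang in sources:
            seq = alignment[lang]
            features[f"{lang}_seg"] = seq[i]
            features[f"{lang}_class"] = SOUND_CLASS.get(seq[i], "?")
            features[f"{lang}_left"] = seq[i - 1] if i > 0 else "#"
            features[f"{lang}_right"] = seq[i + 1] if i < length - 1 else "#"
        label = alignment[target][i] if target in alignment else None
        rows.append((features, label))
    return rows


def train_lookup(rows, keys):
    """Stand-in classifier: count target segments per feature combination."""
    table = defaultdict(Counter)
    for features, label in rows:
        table[tuple(features[k] for k in keys)][label] += 1
    return table


def predict(table, features, keys):
    """Return the most frequent label for this feature combination, or '?'."""
    votes = table.get(tuple(features[k] for k in keys))
    return votes.most_common(1)[0][0] if votes else "?"


# Toy parallel corpus of aligned cognate sets (invented data).
training_sets = [
    {"A": list("pata"), "B": list("bada"), "C": list("fata")},
    {"A": list("tap"), "B": list("dab"), "C": list("taf")},
]
keys = ("A_seg", "B_seg")
train_rows = [row for cs in training_sets for row in extend_alignment(cs, "C")]
table = train_lookup(train_rows, keys)

# Predict the unattested reflex in language C, one alignment site at a time.
test_rows = extend_alignment({"A": list("pap"), "B": list("bab")}, "C")
predicted = [predict(table, features, keys) for features, _ in test_rows]
print(predicted)
```

In a fuller setup, the feature dictionaries would typically be one-hot encoded and passed to a tree-based classifier instead of the lookup table; the row-per-site layout stays the same.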