IR CL Nov 20, 2018

Alignment Analysis of Sequential Segmentation of Lexicons to Improve Automatic Cognate Detection

arXiv:1811.08129v11092 citations Has Code
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem in linguistics for improving cognate detection, presenting incremental advancements over existing methods.

The paper tackles the problem of automatic cognate detection by introducing positional segmentation and graphical error modeling, showing that combining these with language modeling smoothing methods improves results over baselines in both classification and prediction tasks.

Ranking functions in information retrieval are often used in search engines to recommend relevant answers to a query. This paper takes this notion from information retrieval and applies it to the problem of cognate detection. The main contributions of this paper are: (1) positional segmentation, which incorporates the sequential notion; (2) graphical error modelling, which deduces the transformations. Current research focuses on the classification problem, that is, distinguishing whether a pair of words are cognates. This paper focuses on a harder problem: whether a possible cognate can be predicted from a given input. Our study shows that language modelling smoothing methods, applied as the retrieval functions and used in conjunction with positional segmentation and error modelling, give better results than competing baselines in both classification and prediction of cognates. Source code is at: https://github.com/pranav-ust/cognates
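To make the retrieval framing concrete, the following is a minimal sketch of treating cognate prediction as ranking. It is not the paper's implementation (see the linked repository for that). It assumes that positional segmentation can be approximated by tagging each character bigram with its start index, and it uses Jelinek-Mercer smoothing as one example of a language-modelling smoothing method. The function names, the `position:ngram` token format, the interpolation weight `lam`, and the add-one floor on collection probabilities are all illustrative assumptions; graphical error modelling is not covered.

```python
import math
from collections import Counter


def positional_segments(word, n=2):
    """Split a word into character n-grams tagged with their start index.

    Hypothetical scheme: '^' and '$' mark word boundaries, and each token
    looks like '<position>:<ngram>'.
    """
    padded = "^" + word + "$"
    return [f"{i}:{padded[i:i + n]}" for i in range(len(padded) - n + 1)]


def jelinek_mercer_score(query_terms, doc_terms, coll_counts, coll_len, lam=0.5):
    """Log-likelihood of the query under a Jelinek-Mercer smoothed document model."""
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    vocab_size = len(coll_counts)
    score = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        # Assumption: add-one floor so terms unseen in the collection
        # still get a non-zero probability (avoids log(0)).
        p_coll = (coll_counts.get(term, 0) + 1) / (coll_len + vocab_size + 1)
        score += math.log((1 - lam) * p_doc + lam * p_coll)
    return score


def rank_cognate_candidates(query_word, candidates, n=2, lam=0.5):
    """Rank candidate words by smoothed query likelihood, best first."""
    segmented = {c: positional_segments(c, n) for c in candidates}
    coll_counts = Counter(t for terms in segmented.values() for t in terms)
    coll_len = sum(coll_counts.values())
    query_terms = positional_segments(query_word, n)
    scored = [
        (c, jelinek_mercer_score(query_terms, terms, coll_counts, coll_len, lam))
        for c, terms in segmented.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy data chosen for illustration only, not taken from the paper.
    for word, score in rank_cognate_candidates("nacht", ["dog", "nuit", "night"]):
        print(f"{word}\t{score:.3f}")
```

On this toy input, "night" ranks above "nuit" and "dog" because it shares more position-tagged bigrams with "nacht". Classification can be viewed as thresholding such a score for a given pair, whereas prediction returns the top-ranked candidate.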

Code Implementations: 1 repo
