IR CL Nov 20, 2018

Alignment Analysis of Sequential Segmentation of Lexicons to Improve Automatic Cognate Detection

arXiv:1811.08129v11092 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific problem in linguistics for improving cognate detection, presenting incremental advancements over existing methods.

The paper tackles the problem of automatic cognate detection by introducing positional segmentation and graphical error modeling, showing that combining these with language modeling smoothing methods improves results over baselines in both classification and prediction tasks.

Ranking functions in information retrieval are often used in search engines to recommend relevant answers to a query. This paper takes this notion from information retrieval and applies it to the problem of cognate detection. The main contributions of this paper are: (1) positional segmentation, which incorporates the sequential notion; (2) graphical error modelling, which deduces the transformations. Current research focuses on the classification problem, that is, distinguishing whether a pair of words are cognates. This paper focuses on a harder problem: whether a possible cognate can be predicted from a given input. Our study shows that language modelling smoothing methods, when applied as retrieval functions and used in conjunction with positional segmentation and error modelling, give better results than competing baselines in both classification and prediction of cognates. Source code is at: https://github.com/pranav-ust/cognates
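To make the retrieval framing concrete, the following is a minimal sketch of ranking candidate cognates with a smoothed language model over position-tagged character segments. It is not the authors' implementation. The function names (`positional_segments`, `jm_score`, `rank_candidates`), the `bigram@position` tagging scheme, the add-one floor on the collection model, and the choice of Jelinek-Mercer smoothing with `lam=0.5` are all assumptions made for illustration; the graphical error model is not covered.

```python
import math
from collections import Counter


def positional_segments(word, n=2):
    """Split a word into character n-grams tagged with their start position.

    The "ngram@position" tag is an assumed scheme for illustration only;
    it keeps the sequential order of segments visible to the ranker.
    """
    padded = "^" + word + "$"  # boundary markers for word start and end
    return [f"{padded[i:i + n]}@{i}" for i in range(len(padded) - n + 1)]


def jm_score(query, candidate, collection_counts, collection_total, lam=0.5):
    """Log-likelihood of the query's segments under the candidate's
    segment model, interpolated with the collection model
    (Jelinek-Mercer smoothing)."""
    doc_counts = Counter(positional_segments(candidate))
    doc_total = sum(doc_counts.values())
    vocab_size = len(collection_counts)
    score = 0.0
    for seg in positional_segments(query):
        p_doc = doc_counts[seg] / doc_total
        # Add-one floor (assumption) so segments unseen in the collection
        # still get a nonzero probability and the log is defined.
        p_col = (collection_counts[seg] + 1) / (collection_total + vocab_size + 1)
        score += math.log((1 - lam) * p_doc + lam * p_col)
    return score


def rank_candidates(query, candidates, lam=0.5):
    """Return candidates sorted from most to least likely cognate."""
    collection_counts = Counter()
    for word in candidates:
        collection_counts.update(positional_segments(word))
    collection_total = sum(collection_counts.values())
    return sorted(
        candidates,
        key=lambda w: jm_score(query, w, collection_counts, collection_total, lam),
        reverse=True,
    )


if __name__ == "__main__":
    print(rank_candidates("nacht", ["night", "water", "nacht"]))
```

In this sketch the prediction task corresponds to returning the top-ranked candidate, while the classification task could be approximated by thresholding the score of a single word pair; both readings are assumptions about how the retrieval view maps onto the two tasks.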

