CLDec 14, 2018

Detecting Reliable Novel Word Senses: A Network-Centric Approach

Abhik Jana, Animesh Mukherjee, Pawan Goyal

arXiv:1812.05936v10.2

Originality Incremental advance

AI Analysis

This work addresses the challenge of reliable novel word sense detection for computational linguistics, offering an incremental improvement as a post-hoc step.

The paper tackles the problem of detecting new word senses in language evolution by proposing a network-centric approach that uses SVM classification with network features, achieving precision values of 0.86 and 0.74 compared to 0.23-0.32 without the method.

In this era of Big Data, due to expeditious exchange of information on the web, words are being used to denote newer meanings, causing linguistic shift. With the recent availability of large amounts of digitized texts, an automated analysis of the evolution of language has become possible. Our study mainly focuses on improving the detection of new word senses. This paper presents a unique proposal based on network features to improve the precision of new word sense detection. For a candidate word where a new sense (birth) has been detected by comparing the sense clusters induced at two different time points, we further compare the network properties of the subgraphs induced from novel sense cluster across these two time points. Using the mean fractional change in edge density, structural similarity and average path length as features in an SVM classifier, manual evaluation gives precision values of 0.86 and 0.74 for the task of new sense detection, when tested on 2 distinct time-point pairs, in comparison to the precision values in the range of 0.23-0.32, when the proposed scheme is not used. The outlined method can therefore be used as a new post-hoc step to improve the precision of novel word sense detection in a robust and reliable way where the underlying framework uses a graph structure. Another important observation is that even though our proposal is a post-hoc step, it can be used in isolation and that itself results in a very decent performance achieving a precision of 0.54-0.62. Finally, we show that our method is able to detect the well-known historical shifts in 80% cases.

View on arXiv PDF

Similar