Context based Analysis of Lexical Semantics for Hindi Language
This work addresses lexical semantics for Hindi, an under-resourced language, by incrementally improving corpus availability and disambiguation methods.
The authors tackle word sense disambiguation for Hindi by enriching a sense-tagged corpus with new senses for 60 polysemous words and by analyzing two novel lexical associations based on contextual features, reporting favorable results when the methods are evaluated with learning algorithms.
When a word has multiple senses, a lexical semantic task arises: determining which sense is appropriate in a given context. One such task is word sense disambiguation, the identification of the most appropriate meaning of a polysemous word in a given context using computational algorithms. Language processing research in Hindi, the official language of India, and in other Indian languages is restricted by the unavailability of standard corpora. For Hindi word sense disambiguation, too, no large corpus is available. In this work, we prepared text containing new senses of certain words, thereby enriching the sense-tagged Hindi corpus of sixty polysemous words. Furthermore, we analyzed two novel lexical associations for Hindi word sense disambiguation based on the contextual features of the polysemous word. These methods are evaluated with learning algorithms, and favorable results are achieved.
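To make the idea of disambiguating from contextual features concrete, the following is a minimal sketch and not the method of this work: the two lexical associations analyzed here are not specified in the abstract, so a plain bag-of-words context window with an add-one smoothed Naive Bayes classifier is swapped in as a stand-in. The function and class names, the window size, the sense labels `gold` and `sleep`, and the four toy sentences for the polysemous word सोना are all hypothetical and are not drawn from the sense-tagged corpus.

```python
import math
from collections import Counter, defaultdict


def context_features(tokens, target, window=2):
    """Return the words within `window` positions of the first `target` token."""
    i = tokens.index(target)
    return tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]


class NaiveBayesWSD:
    """Multinomial Naive Bayes over context words, with add-one smoothing."""

    def __init__(self):
        self.sense_counts = Counter()            # number of examples per sense
        self.word_counts = defaultdict(Counter)  # context word counts per sense
        self.vocab = set()

    def train(self, examples):
        for feats, sense in examples:
            self.sense_counts[sense] += 1
            self.word_counts[sense].update(feats)
            self.vocab.update(feats)

    def predict(self, feats):
        total = sum(self.sense_counts.values())
        best, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            score = math.log(n / total)  # log prior of the sense
            denom = sum(self.word_counts[sense].values()) + len(self.vocab)
            for w in feats:
                score += math.log((self.word_counts[sense][w] + 1) / denom)
            if score > best_score:
                best, best_score = sense, score
        return best


# Hypothetical toy data for the target word "सोना" (gold / to sleep).
TARGET = "सोना"
toy_corpus = [
    ("सोना एक महंगी धातु है", "gold"),
    ("बाजार में सोना सस्ता हुआ", "gold"),
    ("मुझे रात को जल्दी सोना है", "sleep"),
    ("बच्चे को अब सोना चाहिए", "sleep"),
]

model = NaiveBayesWSD()
model.train(
    (context_features(sentence.split(), TARGET), sense)
    for sentence, sense in toy_corpus
)


def disambiguate(sentence):
    """Predict the sense of TARGET in a whitespace-tokenized sentence."""
    return model.predict(context_features(sentence.split(), TARGET))


print(disambiguate("आज बाजार में सोना महंगा है"))  # gold
print(disambiguate("मुझे अब सोना है"))             # sleep
```

With only four training sentences the predictions simply reflect which context words co-occur with each sense. A realistic setup would train on the full sense-tagged corpus and would likely need proper Hindi tokenization rather than whitespace splitting.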