CL AI, Nov 22, 2019

Anaphora Resolution in Dialogue Systems for South Asian Languages

Vinay Annam, Nikhil Koditala, Radhika Mamidi

arXiv:1911.09994v1, 0.35 citations

Originality: Incremental advance

AI Analysis

This paper addresses a critical NLP problem for dialogue systems in under-resourced South Asian languages, though the advance is incremental: it applies neural methods to a known bottleneck.

The paper tackles anaphora resolution in free word order South Asian languages, specifically Telugu, by proposing a neural network-based system that achieves an F1-score of 86% on a generated conversation corpus.

Anaphora resolution is a challenging task which has been the interest of NLP researchers for a long time. Traditional resolution techniques like eliminative constraints and weighted preferences were successful in many languages. However, they are ineffective in free word order languages like most South Asian languages. Heuristic and rule-based techniques were typical in these languages, which are constrained to context and domain. In this paper, we venture a new strategy using neural networks for resolving anaphora in human-human dialogues. The architecture chiefly consists of three components, a shallow parser for extracting features, a feature vector generator which produces the word embeddings, and a neural network model which will predict the antecedent mention of an anaphora. The system has been trained and tested on Telugu conversation corpus we generated. Given the advantage of the semantic information in word embeddings and appending actor, gender, number, person and part of plural features the model has reached an F1-score of 86.
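The three-component pipeline in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration, not the authors' implementation: the `Mention` fields stand in for shallow-parser output, the toy two-dimensional embeddings and romanized words are invented, and `agreement_score` is a hand-written stand-in for the trained neural network, which the abstract does not specify. The sketch only shows how word embeddings and appended speaker, gender, number and person features could be combined into one vector per anaphor-candidate pair and used to rank candidates.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical feature vocabularies; the paper's actual tag sets are not given.
GENDERS = ("m", "f", "n")
NUMBERS = ("sg", "pl")
PERSONS = (1, 2, 3)
GRAMMAR_DIM = len(GENDERS) + len(NUMBERS) + len(PERSONS)


@dataclass
class Mention:
    """One mention, as a shallow parser might describe it (assumed fields)."""
    text: str
    speaker: str  # the "actor" who uttered the mention
    gender: str
    number: str
    person: int


def one_hot(value, vocab) -> List[float]:
    return [1.0 if value == v else 0.0 for v in vocab]


def mention_vector(m: Mention, embeddings: Dict[str, Sequence[float]], dim: int) -> List[float]:
    """Word embedding followed by one-hot gender, number and person features."""
    emb = list(embeddings.get(m.text, [0.0] * dim))  # zeros for unknown words
    return emb + one_hot(m.gender, GENDERS) + one_hot(m.number, NUMBERS) + one_hot(m.person, PERSONS)


def pair_features(anaphor: Mention, candidate: Mention,
                  embeddings: Dict[str, Sequence[float]], dim: int) -> List[float]:
    """Concatenate both mention vectors and a same-speaker flag."""
    same_speaker = 1.0 if anaphor.speaker == candidate.speaker else 0.0
    return (mention_vector(anaphor, embeddings, dim)
            + mention_vector(candidate, embeddings, dim)
            + [same_speaker])


def agreement_score(features: Sequence[float], dim: int) -> float:
    """Stand-in for a trained model: counts matching grammatical features."""
    width = dim + GRAMMAR_DIM
    anaphor_grammar = features[dim:width]
    candidate_grammar = features[width + dim:2 * width]
    return sum(a * c for a, c in zip(anaphor_grammar, candidate_grammar))


def pick_antecedent(anaphor: Mention, candidates: Sequence[Mention],
                    embeddings: Dict[str, Sequence[float]], dim: int,
                    score: Callable[[Sequence[float], int], float] = agreement_score) -> Mention:
    """Return the candidate whose pair vector receives the highest score."""
    return max(candidates, key=lambda c: score(pair_features(anaphor, c, embeddings, dim), dim))


# Toy usage with invented embeddings and romanized words.
embeddings = {"Ravi": [0.2, 0.1], "Sita": [0.3, 0.4], "atanu": [0.1, 0.1]}
anaphor = Mention("atanu", "B", "m", "sg", 3)
candidates = [Mention("Sita", "A", "f", "sg", 3), Mention("Ravi", "A", "m", "sg", 3)]
best = pick_antecedent(anaphor, candidates, embeddings, dim=2)
```

In a real system the `score` argument would be replaced by a trained classifier over the same pair vector; the agreement count is used here only so the sketch runs without any training data.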
