CL AI Nov 22, 2019

Anaphora Resolution in Dialogue Systems for South Asian Languages

arXiv:1911.09994v1 5 citations
Originality: Incremental advance
AI Analysis

The paper addresses a critical NLP problem for dialogue systems in under-resourced South Asian languages. The advance is incremental, since it applies neural methods to a known bottleneck.

The paper tackles anaphora resolution in free word order South Asian languages, specifically Telugu, by proposing a neural network-based system that achieves an F1-score of 86% on a generated conversation corpus.

Anaphora resolution is a challenging task that has long been of interest to NLP researchers. Traditional resolution techniques like eliminative constraints and weighted preferences were successful in many languages. However, they are ineffective in free word order languages like most South Asian languages. Heuristic and rule-based techniques were typical in these languages, which are constrained to context and domain. In this paper, we venture a new strategy using neural networks for resolving anaphora in human-human dialogues. The architecture chiefly consists of three components: a shallow parser for extracting features, a feature vector generator which produces the word embeddings, and a neural network model which will predict the antecedent mention of an anaphora. The system has been trained and tested on a Telugu conversation corpus we generated. Given the advantage of the semantic information in word embeddings and appending actor, gender, number, person and part of plural features, the model has reached an F1-score of 86.
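The feature-vector step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The `Mention` class, the category inventories, the one-hot encoding, the pairwise concatenation, and the `score_fn` hook are all assumptions made for illustration, and the toy embeddings stand in for whatever word embeddings the paper actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical category inventories; the paper's actual tag sets are not given here.
GENDERS = ("m", "f", "n")
NUMBERS = ("sg", "pl")
PERSONS = ("1", "2", "3")


@dataclass
class Mention:
    """A mention as a shallow parser might describe it (illustrative fields only)."""
    token: str
    gender: str
    number: str
    person: str
    is_actor: bool


def one_hot(value: str, vocab: Sequence[str]) -> List[float]:
    """Encode a categorical value as a one-hot list over vocab."""
    return [1.0 if value == v else 0.0 for v in vocab]


def mention_features(m: Mention, embeddings: Dict[str, List[float]], dim: int) -> List[float]:
    """Word embedding followed by gender, number, person and actor features."""
    emb = list(embeddings.get(m.token, [0.0] * dim))  # zero vector for unknown tokens
    return (
        emb
        + one_hot(m.gender, GENDERS)
        + one_hot(m.number, NUMBERS)
        + one_hot(m.person, PERSONS)
        + [1.0 if m.is_actor else 0.0]
    )


def pair_features(anaphor: Mention, candidate: Mention,
                  embeddings: Dict[str, List[float]], dim: int) -> List[float]:
    """Concatenate anaphor and candidate features into one input vector."""
    return (mention_features(anaphor, embeddings, dim)
            + mention_features(candidate, embeddings, dim))


def resolve(anaphor: Mention, candidates: Sequence[Mention],
            embeddings: Dict[str, List[float]], dim: int,
            score_fn: Callable[[List[float]], float]) -> Mention:
    """Return the candidate whose pair vector gets the highest score.

    score_fn stands in for a trained neural network; it is an assumption here.
    """
    return max(candidates,
               key=lambda c: score_fn(pair_features(anaphor, c, embeddings, dim)))


# Toy data: two-dimensional embeddings and English placeholder tokens.
toy_embeddings = {"he": [0.3, 0.4], "Ravi": [0.1, 0.2], "Sita": [0.5, 0.6]}
anaphor = Mention("he", "m", "sg", "3", False)
ravi = Mention("Ravi", "m", "sg", "3", True)
sita = Mention("Sita", "f", "sg", "3", False)
vector = pair_features(anaphor, ravi, toy_embeddings, 2)
```

In this sketch each mention contributes its embedding plus nine categorical slots, so a pair vector has `2 * (dim + 9)` entries. A trained classifier would be passed as `score_fn`; any callable that maps a vector to a number works for experimentation.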

