CL · AI · Apr 29, 2020

Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension

arXiv:2004.14069v2 · 1008 citations
AI Analysis

This paper addresses the challenge of MRC in low-resource languages by improving transfer from rich source languages, though the contribution is incremental because it builds on existing pre-trained models.

The paper tackles the problem of poor answer boundary detection in multilingual machine reading comprehension by proposing two auxiliary tasks for fine-tuning, resulting in improved performance on cross-lingual datasets.

Multilingual pre-trained models could leverage the training data from a rich source language (such as English) to improve performance on low-resource languages. However, the transfer quality for multilingual Machine Reading Comprehension (MRC) is significantly worse than for sentence classification tasks, mainly because MRC requires detecting the word-level answer boundary. In this paper, we propose two auxiliary tasks in the fine-tuning stage to create additional phrase boundary supervision: (1) a mixed MRC task, which translates the question or passage to other languages and builds cross-lingual question-passage pairs; (2) a language-agnostic knowledge masking task, which leverages knowledge phrases mined from the web. Extensive experiments on two cross-lingual MRC datasets show the effectiveness of our proposed approach.
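
To make the mixed MRC task more concrete, the following is a minimal sketch of how cross-lingual question-passage pairs might be assembled once translations already exist. It is not the authors' implementation: the `MRCExample` type, the `build_mixed_pairs` function, and the assumption that each translated passage comes with its own answer character offsets are all hypothetical. How the answer span is located in a translated passage is not described in the abstract and is left out here.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MRCExample:
    """One question-passage pair with a character-level answer span (hypothetical type)."""
    question: str
    passage: str
    answer_start: int  # inclusive character offset into `passage`
    answer_end: int    # exclusive character offset into `passage`
    question_lang: str
    passage_lang: str


def build_mixed_pairs(
    questions: Dict[str, str],
    passages: Dict[str, Tuple[str, int, int]],
) -> List[MRCExample]:
    """Pair every question with every passage written in a different language.

    questions: language code -> question text (original plus translations).
    passages:  language code -> (passage text, answer_start, answer_end).
               Assumption: answer offsets are already known per language.
    """
    pairs = []
    for q_lang, p_lang in product(sorted(questions), sorted(passages)):
        if q_lang == p_lang:
            continue  # same-language pairs are the ordinary MRC task, not the mixed one
        text, start, end = passages[p_lang]
        pairs.append(MRCExample(questions[q_lang], text, start, end, q_lang, p_lang))
    return pairs
```

With an English and a German version of one example, this yields two mixed pairs (English question with German passage, and the reverse), each still supervised by an answer span inside its passage.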

