CL · Oct 16, 2019

Why can't memory networks read effectively?

Simon Šuster, Madhumita Sushil, Walter Daelemans

arXiv:1910.07350v1

Originality: Incremental advance

AI Analysis

This work addresses a fundamental limitation of memory networks for machine reading comprehension. The advance is incremental, since it builds on prior findings about their multi-hop reasoning issues.

The paper examines why vanilla memory networks are ineffective in single-hop reading comprehension. It finds that entity-specific classification weights and relatively flat attention distributions are the main contributors to poor performance, and it proposes simple network adaptations as remedies.

Memory networks have been a popular choice among neural architectures for machine reading comprehension and question answering. While recent work revealed that memory networks can't truly perform multi-hop reasoning, we show in the present paper that vanilla memory networks are ineffective even in single-hop reading comprehension. We analyze the reasons for this on two cloze-style datasets, one from the medical domain and another including children's fiction. We find that the output classification layer with entity-specific weights, and the aggregation of passage information with relatively flat attention distributions are the most important contributors to poor results. We propose network adaptations that can serve as simple remedies. We also find that the presence of unseen answers at test time can dramatically affect the reported results, so we suggest controlling for this factor during evaluation.
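
The suggestion to control for unseen answers during evaluation can be illustrated with a minimal sketch. It assumes that answers are plain strings and that "unseen" means the gold answer never occurred as an answer in the training set. The function name and data layout are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def accuracy_by_answer_novelty(train_answers, test_gold, test_pred):
    """Report test accuracy separately for gold answers that were
    seen vs. unseen as answers in training (hypothetical helper)."""
    seen = set(train_answers)
    correct = defaultdict(int)
    total = defaultdict(int)
    for gold, pred in zip(test_gold, test_pred):
        bucket = "seen" if gold in seen else "unseen"
        total[bucket] += 1
        correct[bucket] += int(gold == pred)
    # Buckets without any test examples are omitted from the result.
    return {bucket: correct[bucket] / total[bucket] for bucket in total}


# Toy data, invented purely for illustration.
train_answers = ["aspirin", "fever"]
test_gold = ["aspirin", "fever", "insulin", "cough"]
test_pred = ["aspirin", "fever", "fever", "cough"]
scores = accuracy_by_answer_novelty(train_answers, test_gold, test_pred)
```

Reporting the two buckets side by side makes it visible when an aggregate score is dominated by answers the model could never have produced, for example when the output layer only has weights for entities observed in training.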
