Can neural networks understand monotonicity reasoning?
The paper addresses a key reasoning skill for natural language inference models, but its contribution is incremental because it is limited to one specific dataset and the accompanying analysis.
The authors investigate whether neural networks can perform monotonicity reasoning in natural language inference. They introduce the Monotonicity Entailment Dataset (MED) and find that state-of-the-art NLI models perform substantially worse on it, under 55%, especially on downward reasoning.
Monotonicity reasoning is one of the important reasoning skills for any intelligent natural language inference (NLI) model in that it requires the ability to capture the interaction between lexical and syntactic structures. Since no test set has been developed for monotonicity reasoning with wide coverage, it is still unclear whether neural models can perform monotonicity reasoning in a proper way. To investigate this issue, we introduce the Monotonicity Entailment Dataset (MED). Performance by state-of-the-art NLI models on the new test set is substantially worse, under 55%, especially on downward reasoning. In addition, analysis using a monotonicity-driven data augmentation method showed that these models might be limited in their generalization ability in upward and downward reasoning.
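To make the upward/downward distinction concrete, here is a minimal Python sketch of the inference rule behind monotonicity reasoning: in an upward-monotone context, replacing a word with a more general one preserves entailment, while in a downward-monotone context, replacing it with a more specific one does. This is an illustration only, not the authors' method or the MED construction procedure. The toy lexicon `HYPERNYMS`, the function names, and the label strings are all assumptions made for this example.

```python
# Hypothetical toy lexicon: each word maps to one more general term.
HYPERNYMS = {"poodle": "dog", "dog": "animal", "ran": "moved"}


def is_more_general(word, candidate):
    """Return True if `candidate` is reachable from `word` via hypernym links."""
    seen = set()
    while word in HYPERNYMS and word not in seen:
        seen.add(word)
        word = HYPERNYMS[word]
        if word == candidate:
            return True
    return False


def predict(polarity, original, replacement):
    """Label a single-word substitution given the polarity of its context.

    polarity: "upward" or "downward" (assumed to be known in advance).
    Returns "entailment" or "non-entailment" (label names are illustrative).
    """
    if original == replacement:
        return "entailment"
    if is_more_general(original, replacement):
        generalizes = True
    elif is_more_general(replacement, original):
        generalizes = False
    else:
        # Words unrelated in the toy lexicon: no entailment can be licensed.
        return "non-entailment"
    # Upward contexts license generalization; downward contexts license
    # specialization.
    if (polarity == "upward") == generalizes:
        return "entailment"
    return "non-entailment"


# "Some dogs ran" -> "Some animals ran": upward context, dog -> animal.
print(predict("upward", "dog", "animal"))
# "No animals ran" -> "No dogs ran": downward context, animal -> dog.
print(predict("downward", "animal", "dog"))
# "No dogs ran" -> "No animals ran": downward context, dog -> animal.
print(predict("downward", "dog", "animal"))
```

The sketch takes the polarity as given. In real sentences, polarity has to be derived from the interaction between lexical items (such as "no" or "every") and syntactic structure, which is the part the abstract identifies as hard for NLI models.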