Exploring Transitivity in Neural NLI Models through Veridicality
The paper addresses the generalization capacity of NLP models, a question of interest to researchers, but its contribution is incremental because it builds on existing analysis methods.
The paper asks whether neural NLI models can demonstrate human-like generalization by evaluating their ability to perform transitivity inferences, and finds that current models do not perform consistently well on these tasks.
Despite the recent success of deep neural networks in natural language processing, the extent to which they can demonstrate human-like generalization capacities for natural language understanding remains unclear. We explore this issue in the domain of natural language inference (NLI), focusing on the transitivity of inference relations, a fundamental property for systematically drawing inferences. A model capturing transitivity can compose basic inference patterns and draw new inferences. We introduce an analysis method using synthetic and naturalistic NLI datasets involving clause-embedding verbs to evaluate whether models can perform transitivity inferences composed of veridical inferences and arbitrary inference types. We find that current NLI models do not perform consistently well on transitivity inference tasks, suggesting that they lack the generalization capacity for drawing composite inferences from provided training examples. The data and code for our analysis are publicly available at https://github.com/verypluming/transitivity.
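The idea of composing two basic inferences into a transitivity inference can be made concrete with a minimal sketch. It assumes a binary label scheme (entailment vs. non-entailment) and treats the composite A -> C as entailment only when both A -> B and B -> C are entailments. That rule is a simplification suited to constructed examples, not a general logical law. The names `compose` and `is_consistent` and the example sentences are hypothetical illustrations, not taken from the paper's released code.

```python
ENTAILMENT = "entailment"
NON_ENTAILMENT = "non-entailment"


def compose(first: str, second: str) -> str:
    """Expected label for A -> C given labels for A -> B and B -> C.

    Assumption: the composite is entailment only if both steps are
    entailments; otherwise it is treated as non-entailment.
    """
    if first == ENTAILMENT and second == ENTAILMENT:
        return ENTAILMENT
    return NON_ENTAILMENT


def is_consistent(pred_ab: str, pred_bc: str, pred_ac: str) -> bool:
    """Check whether a model's prediction for A -> C matches the
    composition of its own predictions for A -> B and B -> C."""
    return pred_ac == compose(pred_ab, pred_bc)


# Illustrative (hypothetical) example with a clause-embedding verb:
#   A: "Jo knows that Ann and Bob left."
#   B: "Ann and Bob left."   (veridical inference from A)
#   C: "Ann left."           (basic inference from B)
expected = compose(ENTAILMENT, ENTAILMENT)

# With a non-veridical verb such as "hopes", the first step fails,
# so the composite is treated as non-entailment under this scheme.
expected_hopes = compose(NON_ENTAILMENT, ENTAILMENT)
```

A model that labels both basic steps as entailment but labels the composite as non-entailment would be flagged as inconsistent by `is_consistent`, which is the kind of failure a transitivity evaluation is meant to surface.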