Figurative Language in Recognizing Textual Entailment
This work provides a challenging testbed for evaluating RTE models on figurative language, an incremental but domain-specific problem for natural language processing researchers.
The authors tackled the problem of recognizing textual entailment (RTE) with figurative language by recasting five existing datasets into over 12,500 RTE examples, and found that state-of-the-art models struggle with pragmatic inference and world knowledge in this context.
We introduce a collection of recognizing textual entailment (RTE) datasets focused on figurative language. We leverage five existing datasets annotated for several types of figurative language -- simile, metaphor, and irony -- and frame them into over 12,500 RTE examples. We evaluate how well state-of-the-art models trained on popular RTE datasets capture different aspects of figurative language. Our results and analyses indicate that these models might not sufficiently capture figurative language, struggling to perform pragmatic inference and reasoning about world knowledge. Ultimately, our datasets provide a challenging testbed for evaluating RTE models.