CL Sep 24, 2023

Does the "most sinfully decadent cake ever" taste good? Answering Yes/No Questions from Figurative Contexts

arXiv:2309.13748v1134 citations
h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses the challenge of making QA models more robust for users dealing with figurative language, though it is incremental as it builds on existing models and datasets.

The paper addresses the difficulty Question Answering (QA) models have with figurative language. It introduces FigurativeQA, a dataset of 1000 yes/no questions with figurative and non-figurative contexts, and finds that BERT-based QA models lose up to 15 percentage points on average when the context is figurative. GPT-3 and ChatGPT handle figurative text better, and the best overall results come from ChatGPT with chain-of-thought prompting used to rewrite figurative contexts into literal ones before answering.

Figurative language is commonplace in natural language, and while making communication memorable and creative, can be difficult to understand. In this work, we investigate the robustness of Question Answering (QA) models on figurative text. Yes/no questions, in particular, are a useful probe of figurative language understanding capabilities of large language models. We propose FigurativeQA, a set of 1000 yes/no questions with figurative and non-figurative contexts, extracted from the domains of restaurant and product reviews. We show that state-of-the-art BERT-based QA models exhibit an average performance drop of up to 15 percentage points when answering questions from figurative contexts, as compared to non-figurative ones. While models like GPT-3 and ChatGPT are better at handling figurative texts, we show that further performance gains can be achieved by automatically simplifying the figurative contexts into their non-figurative (literal) counterparts. We find that the best overall model is ChatGPT with chain-of-thought prompting to generate non-figurative contexts. Our work provides a promising direction for building more robust QA models with figurative language understanding capabilities.
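The evaluation described in the abstract can be illustrated with a small sketch: score a yes/no predictor separately on figurative and non-figurative contexts, report the gap in percentage points, and optionally rewrite figurative contexts into literal ones before answering. This is not the authors' code. The `Example` record, the `predict` and `simplify` callables, and the function names are all hypothetical stand-ins, and the choice to simplify only figurative contexts is an assumption made for illustration.

```python
from typing import Callable, NamedTuple, Optional, Sequence


class Example(NamedTuple):
    """Hypothetical record for one yes/no question (not the dataset's real schema)."""
    context: str      # review snippet the question is asked about
    question: str     # yes/no question
    answer: bool      # gold label: True for "yes", False for "no"
    figurative: bool  # whether the context uses figurative language


Predictor = Callable[[str, str], bool]   # (context, question) -> predicted answer
Simplifier = Callable[[str], str]        # figurative context -> literal paraphrase


def accuracy(examples: Sequence[Example],
             predict: Predictor,
             simplify: Optional[Simplifier] = None) -> float:
    """Fraction of examples answered correctly.

    If `simplify` is given, it is applied to figurative contexts only
    (an assumption of this sketch) before the predictor sees them.
    """
    if not examples:
        raise ValueError("need at least one example")
    correct = 0
    for ex in examples:
        context = ex.context
        if simplify is not None and ex.figurative:
            context = simplify(context)
        if predict(context, ex.question) == ex.answer:
            correct += 1
    return correct / len(examples)


def figurative_gap(examples: Sequence[Example],
                   predict: Predictor,
                   simplify: Optional[Simplifier] = None) -> float:
    """Accuracy on non-figurative minus accuracy on figurative contexts,
    expressed in percentage points."""
    figurative = [ex for ex in examples if ex.figurative]
    literal = [ex for ex in examples if not ex.figurative]
    return 100.0 * (accuracy(literal, predict)
                    - accuracy(figurative, predict, simplify))
```

In practice `predict` would wrap a QA model and `simplify` a prompted language model. Passing them in as plain callables keeps the scoring logic independent of any particular model API.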

