Negated Complementary Commonsense using Large Language Models
This work addresses a specific vulnerability in large language models and is relevant to AI researchers, though its contribution is incremental because it focuses on a narrow scenario.
The paper tackles the problem of large language models struggling with negated complementary commonsense questions, proposing a model-agnostic method that improves performance by over 11 points compared to GPT-3 few-shot generation.
Larger language models, such as GPT-3, have been shown to excel at many tasks. However, we demonstrate that out-of-the-ordinary questions can throw the model off. This work focuses on finding answers to negated complementary questions in commonsense scenarios. We illustrate how such questions adversely affect the model's responses. We propose a model-agnostic methodology to improve performance in negated complementary scenarios. Our method outperforms few-shot generation from GPT-3 (by more than 11 points) and, more importantly, highlights the significance of studying the responses of large language models to negated complementary questions. The code, data, and experiments are available at https://github.com/navidre/negated_complementary_commonsense.