CL, Apr 22, 2019

SocialIQA: Commonsense Reasoning about Social Interactions

Maarten Sap, Hannah Rashkin, Derek Chen, Ronan LeBras, Yejin Choi

arXiv:1904.09728v3 25.6150 citations h-index: 77

Originality: Incremental advance

AI Analysis

This paper addresses the problem of evaluating and improving the social and emotional intelligence of AI systems for human-AI interaction. The advance is incremental, since it builds on existing benchmark methodologies.

The authors address the lack of a large-scale benchmark for commonsense reasoning about social interactions by introducing Social IQa, a dataset of 38,000 multiple-choice questions. The benchmark exposes a gap of more than 20% between human performance and existing models, and it enables state-of-the-art transfer learning on other commonsense tasks.

We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple-choice questions for probing emotional and social intelligence in a variety of everyday situations (e.g., Q: "Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?" A: "Make sure no one else could hear"). Through crowdsourcing, we collect commonsense questions along with correct and incorrect answers about social interactions, using a new framework that mitigates stylistic artifacts in incorrect answers by asking workers to provide the right answer to a different but related question. Empirical results show that our benchmark is challenging for existing question-answering models based on pretrained language models, compared to human performance (>20% gap). Notably, we further establish Social IQa as a resource for transfer learning of commonsense knowledge, achieving state-of-the-art performance on multiple commonsense reasoning tasks (Winograd Schemas, COPA).
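
The idea of reusing right answers to related questions as incorrect answers can be illustrated with a minimal Python sketch. This is not the authors' code or the dataset's actual format: the names `Item`, `build_item`, and `accuracy` are hypothetical, and the two "related" answers are invented purely for illustration. Only the context, question, and correct answer come from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Item:
    """One multiple-choice question (hypothetical layout, not the official schema)."""
    context: str
    question: str
    choices: List[str]  # candidate answers shown to the model
    label: int          # index of the correct answer in `choices`


def build_item(context: str, question: str, correct: str,
               related_answers: Sequence[str]) -> Item:
    """Assemble an item whose incorrect answers are correct answers to
    different but related questions about the same context."""
    # Sort so the position of the correct answer does not depend on input order.
    choices = sorted([correct, *related_answers])
    return Item(context, question, choices, choices.index(correct))


def accuracy(items: Sequence[Item], predict: Callable[[Item], int]) -> float:
    """Fraction of items for which `predict` returns the labeled index."""
    hits = sum(1 for item in items if predict(item) == item.label)
    return hits / len(items)


item = build_item(
    context="Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy.",
    question="Why did Jordan do this?",
    correct="Make sure no one else could hear",
    # Invented for illustration: plausible right answers to other questions
    # about the same context (e.g. about Tracy's reaction).
    related_answers=["Listen to the secret", "Feel trusted"],
)

# A trivial baseline that always picks the first choice.
print(accuracy([item], lambda it: 0))
```

Because the incorrect answers were written as genuine answers to some question, they share the style of the correct one, which is the property the abstract credits with mitigating stylistic artifacts.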

