CL Aug 28, 2018

Evaluating Theory of Mind in Question Answering

Aida Nematzadeh, Kaylee Burns, Erin Grant, Alison Gopnik, Thomas L. Griffiths

arXiv:1808.09352v132.61123 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This paper addresses the challenge of assessing theory of mind in AI systems. The contribution is incremental: it focuses on dataset creation and evaluation rather than on novel model development.

The authors evaluate question answering models' ability to reason about beliefs using a new dataset inspired by theory-of-mind experiments. All tested neural models failed on tasks that require tracking inconsistent states of the world, and their accuracy decreased notably when random sentences were introduced at test time.

We propose a new dataset for evaluating question answering models with respect to their capacity to reason about beliefs. Our tasks are inspired by theory-of-mind experiments that examine whether children are able to reason about the beliefs of others, in particular when those beliefs differ from reality. We evaluate a number of recent neural models with memory augmentation. We find that all fail on our tasks, which require keeping track of inconsistent states of the world; moreover, the models' accuracy decreases notably when random sentences are introduced to the tasks at test.
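To make the task description concrete, here is a minimal sketch of what a false-belief story and the random-sentence perturbation might look like. It is an illustration only, not the authors' generator: the function names, the sentence templates, the character names (borrowed from the classic Sally-Anne false-belief test), and the distractor sentence are all assumptions introduced here.

```python
import random


def false_belief_story(agent="Sally", other="Anne", obj="ball",
                       first="basket", second="box"):
    """Build a toy false-belief story (hypothetical templates).

    Returns the story lines, the object's true location, and the
    location the absent agent still believes it to be in.
    """
    story = [
        f"{agent} entered the room.",
        f"{other} entered the room.",
        f"The {obj} is in the {first}.",
        f"{agent} exited the room.",
        f"{other} moved the {obj} to the {second}.",
    ]
    reality_answer = second  # where the object actually is
    belief_answer = first    # where the absent agent thinks it is
    return story, reality_answer, belief_answer


def add_distractors(story, distractors, n, rng):
    """Insert n randomly chosen distractor sentences at random positions.

    The relative order of the original story lines is preserved.
    """
    noisy = list(story)
    for _ in range(n):
        position = rng.randint(0, len(noisy))  # inclusive on both ends
        noisy.insert(position, rng.choice(distractors))
    return noisy


story, reality, belief = false_belief_story()
noisy_story = add_distractors(
    story, ["Emily likes the garden."], n=2, rng=random.Random(0)
)
```

A model that only tracks the latest world state would answer "box" to both a reality question and a belief question about Sally; answering the belief question correctly requires keeping the outdated state ("basket") alongside the current one.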
