CL Jul 31, 2022

PASTA: A Dataset for Modeling Participant States in Narratives

Sayontan Ghosh, Mahnaz Koupaee, Isabella Chen, Francis Ferraro, Nathanael Chambers, Niranjan Balasubramanian

arXiv:2208.00329v216.9136 citations h-index: 29

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of narrative understanding for AI systems. Its contribution is incremental: it primarily introduces a new dataset and benchmarks rather than a novel method.

The authors tackle the problem of modeling implicit participant states in narratives by introducing the PASTA dataset, which includes inferable states, counterfactual perturbations of those states, and the story changes each perturbation would require. They find that current LLMs show some ability to reason about states but leave significant room for improvement, especially on tasks requiring diverse types of knowledge.

The events in a narrative are understood as a coherent whole via the underlying states of their participants. Often, these participant states are not explicitly mentioned and are instead left to be inferred by the reader. A model that understands narratives should likewise infer these implicit states, and even reason about the impact of changes to these states on the narrative. To facilitate this goal, we introduce PASTA, a new crowdsourced English-language Participant States dataset. This dataset contains inferable participant states; a counterfactual perturbation to each state; and the changes to the story that would be necessary if the counterfactual were true. We introduce three state-based reasoning tasks that test for the ability to infer when a state is entailed by a story, to revise a story conditioned on a counterfactual state, and to explain the most likely state change given a revised story. Experiments show that today's LLMs can reason about states to some degree, but there is large room for improvement, especially in problems requiring access to, and the ability to reason with, diverse types of knowledge (e.g., physical, numerical, factual).
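
To make the dataset description above concrete, here is a minimal Python sketch of how one record and the three tasks might be represented. Everything in it is an assumption for illustration: the class name `PastaExample`, its field names, the helper functions, and the sample story are hypothetical and are not taken from the released dataset or its actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PastaExample:
    """Hypothetical record layout; field names are illustrative only."""
    story: List[str]            # original story, one sentence per item
    state: str                  # participant state inferable from the story
    counterfactual_state: str   # perturbed version of that state
    revised_story: List[str]    # story minimally edited so the counterfactual holds


def changed_sentences(ex: PastaExample) -> List[int]:
    """Indices of sentences that differ between the story and its revision.

    Assumes both versions have the same number of sentences.
    """
    return [i for i, (a, b) in enumerate(zip(ex.story, ex.revised_story)) if a != b]


def as_tasks(ex: PastaExample) -> Dict[str, dict]:
    """Frame one record as the three state-based tasks named in the abstract."""
    return {
        # Is the state entailed by the story?
        "state_inference": {"story": ex.story, "state": ex.state},
        # Revise the story so that the counterfactual state holds.
        "story_revision": {
            "story": ex.story,
            "counterfactual_state": ex.counterfactual_state,
            "target": ex.revised_story,
        },
        # Explain the most likely state change given the revised story.
        "state_change_explanation": {
            "story": ex.story,
            "revised_story": ex.revised_story,
            "target": ex.counterfactual_state,
        },
    }


# Invented example, not drawn from PASTA.
example = PastaExample(
    story=["Ana packed her bag.", "She walked to the station.", "She caught the train."],
    state="Ana lives close to the station.",
    counterfactual_state="Ana lives far from the station.",
    revised_story=["Ana packed her bag.", "She took a taxi to the station.", "She caught the train."],
)

print(changed_sentences(example))
print(sorted(as_tasks(example)))
```

Under these assumptions, a single record supplies the input and target for all three tasks, and the sentence-level difference shows which part of the story depends on the perturbed state.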
