CL | AI | Feb 16, 2024

Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives

arXiv:2402.11051v1 | 35 citations | h-index: 11 | ACL
Originality: Synthesis-oriented
AI Analysis

The paper addresses a gap in narrative understanding that is relevant to AI researchers, but its contribution is incremental because it focuses on a single domain and dataset.

The authors examine the limitations of LLMs in understanding complex relationships in narratives by introducing the Conan benchmark for extracting character relation graphs from detective stories. Their experiments show that models such as GPT-3.5, GPT-4, and Llama2 struggle with relationship inference and with longer narratives.

Existing datasets for narrative understanding often fail to represent the complexity and uncertainty of relationships in real-life social scenarios. To address this gap, we introduce a new benchmark, Conan, designed for extracting and analysing intricate character relation graphs from detective narratives. Specifically, we designed hierarchical relationship categories and manually extracted and annotated role-oriented relationships from the perspectives of various characters, incorporating both public relationships known to most characters and secret ones known to only a few. Our experiments with advanced Large Language Models (LLMs) like GPT-3.5, GPT-4, and Llama2 reveal their limitations in inferencing complex relationships and handling longer narratives. The combination of the Conan dataset and our pipeline strategy is geared towards understanding the ability of LLMs to comprehend nuanced relational dynamics in narrative contexts.
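
To make the idea of a perspective-dependent character relation graph concrete, here is a minimal sketch of one way such data could be represented. It is an illustration only, not the Conan dataset's actual schema: the `Relation` class, its field names, the `view_for` helper, and the example characters and labels are all assumptions introduced here. The sketch models a two-level relationship category and treats a relation as secret when it is known only to a listed subset of characters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One directed edge in a character relation graph (hypothetical schema)."""
    source: str
    target: str
    category: str      # coarse level of the hierarchy (illustrative label)
    subcategory: str   # fine level of the hierarchy (illustrative label)
    # Characters who know about this relation; empty means it is public.
    known_by: frozenset = frozenset()

    @property
    def is_secret(self) -> bool:
        return bool(self.known_by)


def view_for(relations, character):
    """Return the relations visible from one character's perspective."""
    return [
        r for r in relations
        if not r.is_secret or character in r.known_by
    ]


# Invented example data, not taken from the benchmark.
relations = [
    Relation("Butler", "Lord", "professional", "employee of"),
    Relation("Butler", "Lord", "family", "half-brother of",
             known_by=frozenset({"Butler", "Detective"})),
]

for name in ("Butler", "Guest"):
    visible = view_for(relations, name)
    print(name, [(r.subcategory, r.is_secret) for r in visible])
```

Under these assumptions, a character who is not in `known_by` sees only the public edge, which is the kind of gap between perspectives that the abstract describes.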

