CL | May 26, 2023

Dramatic Conversation Disentanglement

arXiv:2305.16648v1223 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses conversation disentanglement for multi-party interactions in dramatic media. It contributes a dataset and an analysis that build incrementally on existing research, which has focused on IRC chatrooms.

The authors tackle conversation disentanglement in movies and TV series by creating a new dataset of 10,033 dialogue turns from 831 movies, annotated into 2,209 threads. They compare several disentanglement models on this dataset and apply the best-performing one to 808 movies. Two findings stand out: average thread lengths have not decreased significantly over the past 40 years, and characters portrayed by women, while underrepresented, initiate more new threads relative to their speaking time.

We present a new dataset for studying conversation disentanglement in movies and TV series. While previous work has focused on conversation disentanglement in IRC chatroom dialogues, movies and TV shows provide a space for studying complex pragmatic patterns of floor and topic change in face-to-face multi-party interactions. In this work, we draw on theoretical research in sociolinguistics, sociology, and film studies to operationalize a conversational thread (including the notion of a floor change) in dramatic texts, and use that definition to annotate a dataset of 10,033 dialogue turns (comprising 2,209 threads) from 831 movies. We compare the performance of several disentanglement models on this dramatic dataset, and apply the best-performing model to disentangle 808 movies. We see that, contrary to expectation, average thread lengths do not decrease significantly over the past 40 years, and characters portrayed by actors who are women, while underrepresented, initiate more new conversational threads relative to their speaking time.
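The two quantities the abstract reports, average thread length and thread initiations relative to speaking time, can be illustrated with a minimal sketch. Everything in it other than the dataset totals (10,033 turns, 2,209 threads) is an assumption made for illustration: the toy scene, the speaker names, the `(speaker, words, thread_id)` layout, and the use of word counts as a stand-in for speaking time. The abstract does not say how the authors measure speaking time, so this is not their method.

```python
from collections import Counter, defaultdict

# Hypothetical toy scene: one (speaker, words_spoken, thread_id) tuple per turn.
# A disentanglement model (or annotator) supplies the thread_id for each turn.
turns = [
    ("ANNA", 12, 0),
    ("BEN", 8, 0),
    ("CARA", 5, 1),
    ("ANNA", 6, 0),
    ("BEN", 4, 1),
    ("CARA", 10, 2),
]


def average_thread_length(turns):
    """Mean number of dialogue turns per thread."""
    thread_sizes = Counter(thread for _, _, thread in turns)
    return len(turns) / len(thread_sizes)


def initiation_rate(turns):
    """Threads started per word spoken, for each speaker.

    Word count is an assumed proxy for speaking time.
    """
    words = defaultdict(int)
    starts = defaultdict(int)
    seen_threads = set()
    for speaker, n_words, thread in turns:
        words[speaker] += n_words
        if thread not in seen_threads:
            # The first turn of a thread counts as initiating it.
            seen_threads.add(thread)
            starts[speaker] += 1
    return {speaker: starts[speaker] / words[speaker] for speaker in words}


toy_avg = average_thread_length(turns)
rates = initiation_rate(turns)

# Dataset-level average from the totals stated in the abstract.
dataset_avg = 10033 / 2209  # roughly 4.5 turns per thread
```

In the toy scene, CARA speaks less than ANNA but opens two of the three threads, so her initiation rate is higher. That is the kind of pattern a measure normalized by speaking time is designed to surface.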

Code Implementations: 1 repo