CLCYMar 1, 2022

Advancing an Interdisciplinary Science of Conversation: Insights from a Large Multimodal Corpus of Human Speech

arXiv:2203.00674v112 citationsh-index: 28
Originality Incremental advance
AI Analysis

This work addresses the need for interdisciplinary insights into conversation, which is a fundamental human activity, by providing a public dataset and methods that could benefit fields like psychology, linguistics, and AI, though it is incremental in building on existing literature.

The researchers tackled the problem of limited scientific understanding of conversation by analyzing a large multimodal corpus of 1,656 recorded conversations, extending findings like cooperativeness in turn-taking and applying machine learning to analyze conversational success and well-being.

People spend a substantial portion of their lives engaged in conversation, and yet our scientific understanding of conversation is still in its infancy. In this report we advance an interdisciplinary science of conversation, with findings from a large, novel, multimodal corpus of 1,656 recorded conversations in spoken English. This 7+ million word, 850 hour corpus totals over 1TB of audio, video, and transcripts, with moment-to-moment measures of vocal, facial, and semantic expression, along with an extensive survey of speaker post conversation reflections. We leverage the considerable scope of the corpus to (1) extend key findings from the literature, such as the cooperativeness of human turn-taking; (2) define novel algorithmic procedures for the segmentation of speech into conversational turns; (3) apply machine learning insights across various textual, auditory, and visual features to analyze what makes conversations succeed or fail; and (4) explore how conversations are related to well-being across the lifespan. We also report (5) a comprehensive mixed-method report, based on quantitative analysis and qualitative review of each recording, that showcases how individuals from diverse backgrounds alter their communication patterns and find ways to connect. We conclude with a discussion of how this large-scale public dataset may offer new directions for future research, especially across disciplinary boundaries, as scholars from a variety of fields appear increasingly interested in the study of conversation.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes