Quantifying Intimacy in Language
This work addresses the challenge of measuring intimacy in language, a problem relevant to researchers in computational linguistics and social psychology. Its contribution is incremental in that it builds on existing findings from social psychology.
The authors introduce a computational framework for quantifying intimacy in language, with a dataset and a deep learning model that predicts the intimacy level of questions (Pearson's r = 0.87). They then analyze 80.5 million questions to show how individuals adjust intimacy in response to social factors such as gender and audience.
Intimacy is a fundamental aspect of how we relate to others in social settings. Language encodes the social information of intimacy through both topics and other, more subtle cues (such as linguistic hedging and swearing). Here, we introduce a new computational framework for studying expressions of intimacy in language, with an accompanying dataset and deep learning model for accurately predicting the intimacy level of questions (Pearson's r = 0.87). By analyzing a dataset of 80.5M questions across social media, books, and films, we show that individuals employ interpersonal pragmatic moves in their language to align their intimacy with social settings. Then, in three studies, we further demonstrate how individuals modulate their intimacy to match social norms around gender, social distance, and audience, with each study validating key findings from social psychology. Our work demonstrates that intimacy is a pervasive and impactful social dimension of language.
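The reported Pearson's r measures how closely the model's predicted intimacy scores track the human-annotated scores. The following is a minimal sketch of that computation, not the authors' evaluation code: the function name `pearson_r` and the score lists are hypothetical, and the values are made up purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical intimacy scores for five questions (illustrative values only).
annotated = [-0.8, -0.3, 0.1, 0.4, 0.9]   # assumed human ratings
predicted = [-0.6, -0.4, 0.2, 0.3, 0.7]   # assumed model outputs

r = pearson_r(annotated, predicted)
print(f"Pearson's r = {r:.2f}")
```

A value near 1 indicates that the predictions rise and fall with the annotations; the coefficient is insensitive to the scale and offset of either score range.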