CL LG Jun 10, 2024

Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles

Julia Kruk, Michela Marchini, Rijul Magu, Caleb Ziems, David Muchlinski, Diyi Yang

arXiv:2406.06840v2 14.928 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of identifying subtle hate speech and discrimination in social media and political discourse, with applications in hate speech detection, neology, and political science. Its contribution is incremental, since it applies existing LLM methods to a new domain.

The paper tackles the detection of coded dog whistles by using Large Language Models (LLMs) for word-sense disambiguation. The result is a dataset of 16,550 high-confidence examples, which the authors describe as the largest dataset of disambiguated dog whistle usage.

A dog whistle is a form of coded communication that carries a secondary meaning to specific audiences and is often weaponized for racial and socioeconomic discrimination. Dog whistling historically originated from United States politics, but in recent years has taken root in social media as a means of evading hate speech detection systems and maintaining plausible deniability. In this paper, we present an approach for word-sense disambiguation of dog whistles from standard speech using Large Language Models (LLMs), and leverage this technique to create a dataset of 16,550 high-confidence coded examples of dog whistles used in formal and informal communication. Silent Signals is the largest dataset of disambiguated dog whistle usage, created for applications in hate speech detection, neology, and political science. The dataset can be found at https://huggingface.co/datasets/SALT-NLP/silent_signals.
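The abstract frames the task as deciding, for a given sentence, whether a term carries its coded meaning or its standard one. A minimal sketch of that framing is shown below. It is an illustration under stated assumptions, not the authors' pipeline: the prompt wording, the function names `build_disambiguation_prompt` and `disambiguate`, the `coded`/`standard` labels, and the `llm` callable are all hypothetical, and any real model client would have to be supplied by the caller.

```python
from typing import Callable


def build_disambiguation_prompt(term: str, coded_meaning: str, sentence: str) -> str:
    """Build a prompt asking whether `term` is used in its coded sense.

    The wording is a hypothetical example, not the prompt used in the paper.
    """
    return (
        f'The term "{term}" can act as a dog whistle meaning: {coded_meaning}.\n'
        f'Sentence: "{sentence}"\n'
        "Is the term used with its coded meaning in this sentence? "
        "Answer with exactly one word: coded or standard."
    )


def disambiguate(
    term: str,
    coded_meaning: str,
    sentence: str,
    llm: Callable[[str], str],
) -> str:
    """Return "coded", "standard", or "unknown" for one sentence.

    `llm` is any function that maps a prompt string to a reply string
    (an assumption of this sketch; no specific model API is implied).
    """
    prompt = build_disambiguation_prompt(term, coded_meaning, sentence)
    reply = llm(prompt).strip().lower()
    if reply.startswith("coded"):
        return "coded"
    if reply.startswith("standard"):
        return "standard"
    # Anything else is treated as a low-confidence answer and left unlabeled.
    return "unknown"
```

Mapping unparseable replies to `unknown` instead of forcing a label is one simple way to keep only high-confidence examples; the paper's own filtering criteria may differ.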
