Making FETCH! Happen: Finding Emergent Dog Whistles Through Common Habitats
This paper addresses the challenge of detecting coded political expressions for content moderation, though its contribution is incremental because it builds on existing methods.
The paper tackles the problem of identifying novel dog whistles in social media, where current systems fail, and presents EarShot, a baseline system that combines vector databases and LLMs to efficiently find them.
WARNING: This paper contains content that may be upsetting or offensive to some readers.
Dog whistles are coded expressions with dual meanings: one intended for the general public (outgroup) and another that conveys a specific message to an intended audience (ingroup). Often, these expressions are used to convey controversial political opinions while maintaining plausible deniability and to slip past content moderation filters. Identification of dog whistles relies on curated lexicons, which struggle to stay up to date. We introduce FETCH!, a task for finding novel dog whistles in massive social media corpora. We find that state-of-the-art systems fail to achieve meaningful results across three distinct social media case studies. We present EarShot, a strong baseline system that combines the strengths of vector databases and Large Language Models (LLMs) to efficiently and effectively identify new dog whistles.