CL · Jun 4

Forgive or forget: Understanding the context of hate in audio retrieval systems

Arghya Pal, Sailaja Rajanala, Raphael C.-W. Phan, Shekhar Nayak

arXiv:2606.05857 · 13.5

Predicted impact: top 72% in CL · last 90 days
Originality: Incremental advance

AI Analysis

For developers of audio retrieval systems, this work addresses the challenge of handling toxic content without compromising retrieval quality, though the improvements are incremental.

The paper tackles toxic retrieval in text-to-audio systems by proposing a post hoc causal debiasing framework with a sentiment-controlled mediator that reduces toxicity while preserving semantic relevance. Experiments show consistent toxicity reduction with minimal loss in retrieval accuracy.

Handling toxic retrieval in text-to-audio systems is challenging due to contextual dependencies. Existing strategies (e.g., rephrasing, summarization) risk altering intent or omitting details. We propose a post hoc causal debiasing framework with a sentiment-controlled mediator to preserve semantic relevance while suppressing harmful speech. Our approach is model-agnostic and integrates seamlessly with existing retrieval pipelines. We introduce two variants: Forgive, which re-ranks and filters toxic audio via logit adjustment, and Forget, which generates counterfactual toxic prompts to mitigate harmful retrievals. Experiments show consistent toxicity reduction with minimal loss in retrieval accuracy, improving both safety and reliability.
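The abstract describes Forgive only at a high level (re-ranking and filtering via logit adjustment), so the following is a minimal sketch of that general idea, not the authors' implementation. It assumes each candidate already carries a retrieval logit and a toxicity probability from some external classifier; the function name `rerank_with_toxicity_penalty`, the `penalty` weight, and the `max_toxicity` cutoff are all hypothetical choices made for illustration, and the sentiment-controlled mediator is not modeled here.

```python
from typing import List, Tuple

# Hypothetical candidate format: (audio_id, retrieval_logit, toxicity_prob in [0, 1]).
Candidate = Tuple[str, float, float]


def rerank_with_toxicity_penalty(
    candidates: List[Candidate],
    penalty: float = 2.0,        # assumed weight on the toxicity term
    max_toxicity: float = 0.9,   # assumed hard-filter threshold
) -> List[Tuple[str, float]]:
    """Filter highly toxic clips, then re-rank the rest by adjusted logit."""
    adjusted = []
    for audio_id, logit, toxicity in candidates:
        if toxicity > max_toxicity:
            # Filtering step: drop clips above the cutoff entirely.
            continue
        # Logit adjustment: subtract a penalty proportional to toxicity.
        adjusted.append((audio_id, logit - penalty * toxicity))
    # Re-ranking step: highest adjusted logit first.
    adjusted.sort(key=lambda item: item[1], reverse=True)
    return adjusted


# Toy scores, invented for illustration only.
candidates = [
    ("clip_a", 3.0, 0.95),
    ("clip_b", 2.5, 0.10),
    ("clip_c", 2.8, 0.50),
    ("clip_d", 1.0, 0.00),
]
ranked = rerank_with_toxicity_penalty(candidates)
for audio_id, score in ranked:
    print(f"{audio_id}: {score:.2f}")
```

Because the adjustment happens after scoring, a step like this can sit behind any retriever that exposes per-candidate scores, which is consistent with the post hoc, model-agnostic framing in the abstract. In this toy input, `clip_a` has the highest raw logit but is removed by the filter, and `clip_c` falls below `clip_b` once the penalty is applied.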
