SD · CL · Apr 10

Few-Shot Contrastive Adaptation for Audio Abuse Detection in Low-Resource Indic Languages

arXiv:2604.09094 · 50.6 · h-index: 27
AI Analysis

This paper addresses abusive speech detection in multilingual, low-resource audio settings and offers a method that sidesteps transcription errors by working directly on audio. The contribution is incremental, since it builds on existing CLAP models.

The paper tackles abusive speech detection directly from audio in low-resource Indic languages, using Contrastive Language-Audio Pre-training (CLAP) with few-shot adaptation. In cross-lingual settings, it reports performance competitive with fully supervised systems.

Abusive speech detection is becoming increasingly important as social media shifts towards voice-based interaction, particularly in multilingual and low-resource settings. Most current systems rely on automatic speech recognition (ASR) followed by text-based hate speech classification, but this pipeline is vulnerable to transcription errors and discards prosodic information carried in speech. We investigate whether Contrastive Language-Audio Pre-training (CLAP) can support abusive speech detection directly from audio. Using the ADIMA dataset, we evaluate CLAP-based representations under few-shot supervised contrastive adaptation in cross-lingual and leave-one-language-out settings, with zero-shot prompting included as an auxiliary analysis. Our results show that CLAP yields strong cross-lingual audio representations across ten Indic languages, and that lightweight projection-only adaptation achieves competitive performance with respect to fully supervised systems trained on complete training data. However, the benefits of few-shot adaptation are language-dependent and not monotonic with shot size. These findings suggest that contrastive audio-text models provide a promising basis for cross-lingual audio abuse detection in low-resource settings, while also indicating that transfer remains incomplete and language-specific in important ways.
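The abstract does not spell out the loss, architecture, or hyperparameters, so the following is only a minimal sketch of what "projection-only supervised contrastive adaptation" could look like. It assumes the standard supervised contrastive formulation over frozen embeddings passed through a single linear projection. The function names, the temperature value, and the toy 2-D vectors standing in for CLAP audio embeddings are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def supcon_loss(embeddings, labels, W, temperature=0.1):
    """Supervised contrastive loss over projected embeddings (illustrative).

    embeddings: (n, d) frozen audio embeddings (stand-ins for CLAP outputs)
    labels:     (n,) integer labels, e.g. 0 = non-abusive, 1 = abusive
    W:          (d, k) projection matrix; the only trainable part in this sketch
    Requires n >= 2.
    """
    z = l2_normalize(embeddings @ W)
    sim = z @ z.T / temperature

    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    # An anchor is never contrasted with itself.
    sim = np.where(self_mask, -np.inf, sim)

    # Row-wise log-softmax over all other samples in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - row_max - log_denominator

    # Positives are other samples that share the anchor's label.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0  # skip anchors with no positive in the batch

    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[valid] / pos_count[valid]
    return float(per_anchor.mean())


# Toy few-shot batch: two samples per class, already well separated.
toy_embeddings = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
matching_labels = np.array([0, 0, 1, 1])
mismatched_labels = np.array([0, 1, 0, 1])
identity_projection = np.eye(2)

loss_matching = supcon_loss(toy_embeddings, matching_labels, identity_projection)
loss_mismatched = supcon_loss(toy_embeddings, mismatched_labels, identity_projection)
print(loss_matching < loss_mismatched)
```

In an actual adaptation run, `W` would be updated by gradient descent on this loss with an autodiff framework while the encoder stays frozen. The toy comparison only shows that the loss is small when same-label samples are already close and large when they are not.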

