CL, Dec 2, 2024

Impromptu Cybercrime Euphemism Detection

arXiv:2412.01413v2, 21 citations, h-index: 21, COLING
Originality: Incremental advance
AI Analysis

The paper addresses content security for social media platforms by detecting impromptu euphemisms, but the advance is incremental because it builds on existing euphemism detection methods.

The paper tackles the problem of detecting impromptu euphemisms for cybercrime on social media, introducing the ICED dataset and a detection framework with context augmentation and multi-round iterative training, achieving a 76-fold improvement over previous state-of-the-art methods.

Detecting euphemisms is essential for content security on social media platforms, but existing euphemism detection methods are ineffective on impromptu euphemisms. In this work, we make a first attempt at exploring impromptu euphemism detection and introduce the Impromptu Cybercrime Euphemisms Detection (ICED) dataset. Moreover, we propose a detection framework tailored to this problem, which employs context augmentation modeling and multi-round iterative training. Our detection framework mainly consists of a coarse-grained and a fine-grained classification model. The coarse-grained classification model removes most of the harmless content from the corpus to be examined. The fine-grained model, the impromptu euphemism detector, integrates context augmentation and multi-round iterative training to better predict the actual meaning of a masked token. In addition, we leverage ChatGPT to evaluate the model's capability. Experimental results demonstrate that our approach achieves a remarkable 76-fold improvement compared to the previous state-of-the-art euphemism detector.
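The two-stage design described in the abstract can be illustrated with a small, standard-library-only sketch. It is not the authors' implementation: the cue-word filter stands in for the coarse-grained classifier, a bag-of-words context profile stands in for the fine-grained masked-token model, and a simple pseudo-labelling loop stands in for multi-round iterative training. All function names, cue words, thresholds, and example posts are hypothetical, and context augmentation is not modeled here.

```python
from collections import Counter

# Hypothetical cue words; a real coarse-grained model would be a trained classifier.
SUSPICIOUS_CUES = {"sell", "buy", "ship", "gram", "deal"}


def coarse_filter(posts, cues=SUSPICIOUS_CUES):
    """Stage 1 stand-in: keep only posts that contain at least one cue word."""
    return [p for p in posts if cues & set(p.lower().split())]


def mask_token(post, target):
    """Tokenize a post and replace every occurrence of `target` with [MASK]."""
    target = target.lower()
    return ["[MASK]" if t == target else t for t in post.lower().split()]


def build_profiles(labeled):
    """Count the context words seen around the mask for each known meaning."""
    profiles = {}
    for tokens, meaning in labeled:
        context = (t for t in tokens if t != "[MASK]")
        profiles.setdefault(meaning, Counter()).update(context)
    return profiles


def predict_meaning(masked_tokens, profiles):
    """Stage 2 stand-in: pick the meaning whose profile best matches the context."""
    context = [t for t in masked_tokens if t != "[MASK]"]
    scores = {m: sum(prof[t] for t in context) for m, prof in profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def iterative_training(seed, unlabeled, rounds=3, threshold=2):
    """Pseudo-labelling loop: confident predictions join the training set each round."""
    labeled, pool = list(seed), list(unlabeled)
    for _ in range(rounds):
        profiles = build_profiles(labeled)
        remaining = []
        for tokens in pool:
            meaning, score = predict_meaning(tokens, profiles)
            if score >= threshold:
                labeled.append((tokens, meaning))
            else:
                remaining.append(tokens)
        if len(remaining) == len(pool):  # nothing new was labelled; stop early
            break
        pool = remaining
    return build_profiles(labeled)


if __name__ == "__main__":
    seed = [
        (mask_token("sell snow by the gram", "snow"), "drug"),
        (mask_token("heavy snow on the road today", "snow"), "weather"),
    ]
    unlabeled = [mask_token("ship candy by the gram", "candy")]
    profiles = iterative_training(seed, unlabeled)

    posts = ["I can ship flour by the gram tonight", "lovely sunset at the beach"]
    for post in coarse_filter(posts):
        print(post, "->", predict_meaning(mask_token(post, "flour"), profiles))
```

Running the coarse filter first means the more expensive masked-token step only sees posts that pass the cheap check, which mirrors the abstract's motivation for removing most harmless content before fine-grained detection.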

