Anomaly Detection in Human Language via Meta-Learning: A Few-Shot Approach
This work addresses the challenge of sparse and variable anomalies in language for applications such as content moderation, though its contribution is incremental in that it combines existing techniques.
The paper tackles the problem of detecting anomalies in human language, such as spam and fake news, with limited labeled data by using a meta-learning framework, achieving improved F1 and AUC scores compared to baselines.
We propose a meta-learning framework for detecting anomalies in human language across diverse domains with limited labeled data. Anomalies in language, ranging from spam and fake news to hate speech, pose a major challenge due to their sparsity and variability. We treat anomaly detection as a few-shot binary classification problem and leverage meta-learning to train models that generalize across tasks. Using datasets from domains such as SMS spam, COVID-19 fake news, and hate speech, we evaluate model generalization on unseen tasks with minimal labeled anomalies. Our method combines episodic training with prototypical networks and domain resampling to adapt quickly to new anomaly detection tasks. Empirical results show that our method outperforms strong baselines in F1 and AUC scores. We also release the code and benchmarks to facilitate further research in few-shot text anomaly detection.
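To make the episodic, prototype-based formulation concrete, the following is a minimal sketch of a single few-shot binary episode. It is not the released implementation. It assumes texts have already been mapped to fixed-length embeddings by some encoder (not specified here), uses squared Euclidean distance as in the standard prototypical-network formulation, and omits domain resampling. The function names `sample_episode`, `prototypes`, `class_probabilities`, and `predict` are hypothetical.

```python
import math
import random


def sample_episode(normal, anomalous, n_support, n_query, rng):
    """Draw one binary episode: label 0 = normal, label 1 = anomaly.

    Each pool is a list of embedding vectors (lists of floats).
    """
    support, query = [], []
    for label, pool in ((0, normal), (1, anomalous)):
        picked = rng.sample(pool, n_support + n_query)
        support += [(x, label) for x in picked[:n_support]]
        query += [(x, label) for x in picked[n_support:]]
    return support, query


def prototypes(support):
    """Compute one prototype per class as the mean of its support embeddings."""
    sums, counts = {}, {}
    for x, y in support:
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def class_probabilities(x, protos):
    """Softmax over negative squared Euclidean distances to each prototype."""
    logits = {
        y: -sum((a - b) ** 2 for a, b in zip(x, p)) for y, p in protos.items()
    }
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {y: math.exp(v - top) for y, v in logits.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


def predict(x, protos):
    """Return the label of the most probable (nearest) prototype."""
    probs = class_probabilities(x, protos)
    return max(probs, key=probs.get)


# Toy usage with hand-made 2-D "embeddings" (assumed data, for illustration only).
rng = random.Random(0)
normal = [[0.1 * i, 0.0] for i in range(6)]
anomalous = [[5.0 + 0.1 * i, 5.0] for i in range(6)]
support, query = sample_episode(normal, anomalous, n_support=3, n_query=2, rng=rng)
protos = prototypes(support)
predictions = [predict(x, protos) for x, _ in query]
```

In meta-training, the negative log of the true-class probability on the query examples would serve as the episode loss for updating the encoder; at test time, only the support set of the unseen task is needed to build prototypes.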