Deep Anomaly Detection in Text
This work addresses anomaly detection in natural language processing. Its approach is novel but incremental: it adapts self-supervised techniques from computer vision to text.
The paper tackles anomaly detection in text with a self-supervised method built on pretext tasks tailored for text corpora, improving on the state of the art on the 20Newsgroups and AG News datasets in both semi-supervised and unsupervised settings.
Deep anomaly detection methods have become increasingly popular in recent years, with approaches such as Stacked Autoencoders, Variational Autoencoders, and Generative Adversarial Networks greatly improving the state of the art. Other methods augment classical models (such as the One-Class Support Vector Machine) by learning an appropriate kernel function with neural networks. Recent developments in representation learning by self-supervision are proving very beneficial for anomaly detection. Inspired by advances in self-supervised anomaly detection in computer vision, this thesis aims to develop a method for detecting anomalies by exploiting pretext tasks tailored for text corpora. This approach greatly improves the state of the art on two datasets, 20Newsgroups and AG News, for both semi-supervised and unsupervised anomaly detection, demonstrating the potential of self-supervised anomaly detectors in natural language processing.