LG, ML | Dec 9, 2019

Oversampling Log Messages Using a Sequence Generative Adversarial Network for Anomaly Detection and Classification

arXiv:1912.04747v2 | 7 citations
AI Analysis

This paper addresses data imbalance in log analysis for system monitoring, but the contribution is incremental because it applies existing methods to a specific domain.

The paper tackles the problem of imbalanced log message data in anomaly detection by proposing a SeqGAN-based oversampling model, which increased accuracy on the BGL and Openstack datasets.

Dealing with imbalanced data is one of the main challenges in machine/deep learning algorithms for classification. This issue is more important with log message data as it is typically very imbalanced and negative logs are rare. In this paper, a model is proposed to generate text log messages using a SeqGAN network. Then features are extracted using an Autoencoder and anomaly detection is done using a GRU network. The proposed model is evaluated with two imbalanced log data sets, namely BGL and Openstack. Results are presented which show that oversampling and balancing data increases the accuracy of anomaly detection and classification.
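
The balancing step described in the abstract can be illustrated with a small sketch. This is not the paper's method: it assumes plain random duplication of minority-class messages as a stand-in for the SeqGAN generator, and the function name `oversample_minority`, the labels, and the toy log lines are all hypothetical.

```python
import random
from collections import Counter


def oversample_minority(messages, labels, seed=0):
    """Duplicate randomly chosen messages of the smaller classes until
    every class has as many examples as the largest class.

    Assumption: random duplication replaces the SeqGAN generator here,
    so no new text is synthesised.
    """
    rng = random.Random(seed)

    # Group messages by their class label.
    by_label = {}
    for msg, lab in zip(messages, labels):
        by_label.setdefault(lab, []).append(msg)

    # Every class is grown to the size of the largest one.
    target = max(len(msgs) for msgs in by_label.values())

    out_msgs, out_labels = [], []
    for lab, msgs in by_label.items():
        extra = [rng.choice(msgs) for _ in range(target - len(msgs))]
        for m in msgs + extra:
            out_msgs.append(m)
            out_labels.append(lab)
    return out_msgs, out_labels


# Hypothetical toy data: four normal messages and one anomalous message.
messages = [
    "instance started",
    "instance stopped",
    "heartbeat ok",
    "volume attached",
    "kernel panic detected",
]
labels = ["normal", "normal", "normal", "normal", "anomaly"]

balanced_msgs, balanced_labels = oversample_minority(messages, labels)
print(Counter(balanced_labels))
```

In the pipeline the abstract describes, the balanced set would then go to the Autoencoder for feature extraction and to the GRU network for detection; those stages are not sketched here.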
