SEIR Feb 14, 2022

vue4logs -- Automatic Structuring of Heterogeneous Computer System Logs

arXiv:2202.07504v1
2 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of understanding unstructured log data for system monitoring and analysis. It is an incremental improvement over existing methods.

The paper tackles the problem of structuring heterogeneous computer system logs by introducing a novel method that uses vector space modeling and similarity grouping to extract event templates, achieving state-of-the-art accuracy and robustness on real-world benchmarks.

Computer system log data is commonly used in system monitoring, performance characteristic investigation, workflow modeling, and anomaly detection. Log data is inherently unstructured or semi-structured, which makes it hard to understand the event flow or other important information about a system by reading raw logs. The process of structuring log files first groups log messages by the system events that triggered them, then extracts an event template to represent the log messages of each event. This paper introduces a novel method for extracting event templates from raw system log files: it uses the vector space model commonly used in Information Retrieval to vectorize log data, and groups log messages into event templates based on their vector similarity. The template extraction process is further enhanced with character- and length-based filters. When evaluated on publicly available real-world log data benchmarks, the proposed method outperforms all available state-of-the-art systems in accuracy and robustness.
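
The general idea in the abstract (vectorize log messages, group them by vector similarity, then derive a template per group) can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the raw term-frequency weighting, the 0.6 similarity threshold, the `<*>` wildcard marker, and the function names are all assumptions chosen for illustration, and only a length-based filter is shown (the character-based filter is omitted).

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge_template(template, tokens):
    """Keep tokens that agree position by position; mark the rest as variable."""
    return [t if t == u else "<*>" for t, u in zip(template, tokens)]


def extract_templates(lines, threshold=0.6):
    """Greedily assign each log line to the most similar existing group.

    threshold is a hypothetical value, not one taken from the paper.
    """
    groups = []  # each group: {"template": [tokens], "members": [raw lines]}
    for line in lines:
        tokens = line.split()  # assumed tokenizer: split on whitespace
        best, best_sim = None, 0.0
        for group in groups:
            # Length-based filter: only compare messages with equal token counts.
            if len(group["template"]) != len(tokens):
                continue
            sim = cosine(Counter(group["template"]), Counter(tokens))
            if sim > best_sim:
                best, best_sim = group, sim
        if best is not None and best_sim >= threshold:
            best["template"] = merge_template(best["template"], tokens)
            best["members"].append(line)
        else:
            groups.append({"template": tokens, "members": [line]})
    return groups


if __name__ == "__main__":
    sample = [
        "Connection from 10.0.0.1 closed",
        "Connection from 10.0.0.2 closed",
        "Disk usage at 91 percent",
        "Connection from 10.0.0.3 closed",
    ]
    for group in extract_templates(sample):
        print(" ".join(group["template"]), len(group["members"]))
```

On the four sample lines, the three "Connection" messages share three of four tokens and so fall into one group, while the five-token "Disk usage" message is kept apart by the length filter. A fuller version would likely use a weighting such as TF-IDF and an index for candidate lookup rather than comparing against every group.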

