SE, Jan 24, 2020

Software Logging for Machine Learning

arXiv:2001.10794v1
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of inefficient log management for analytics in large software systems, although its contribution is incremental because it builds on existing logging practices.

The paper tackles the problem of ad-hoc, unstructured system logs in software-intensive systems, which limits their usefulness for analytics and machine learning. It proposes a systematic and structured approach for generating log data optimized for ML, and validates the approach through expert interviews at a telecommunications company.

System logs perform a critical function in software-intensive systems, as logs record the state of the system and significant events in the system at important points in time. Unfortunately, log entries are typically created in an ad-hoc, unstructured and uncoordinated fashion, limiting their usefulness for analytics and machine learning. In this paper, we first present the main challenges of contemporary approaches to generating and storing system log data for large, complex, software-intensive systems, based on an in-depth case study at a world-leading telecommunications company. Second, we present a systematic and structured approach for generating log data that does not suffer from the aforementioned challenges and is optimized for use in machine learning. Third, we provide validation of the approach based on expert interviews, which confirm that the approach addresses the identified challenges and problems.

