SE, Dec 22, 2021
Log severity level classification: an approach for systems in production
Eduardo Mendes, Fabio Petrillo
Context: Logs are often the primary source of information for system developers and operations engineers to understand and diagnose the behavior of a software system in production. In many cases, logs are the only evidence available for fault investigation. Problem: However, an inappropriate choice of log severity level can affect the amount of log data generated and, consequently, its quality. The resulting storage overhead can degrade the performance of log-based monitoring systems, because excess log data adds noise and makes it harder to find what actually matters during diagnosis. Goal: This research aims to decrease the overhead of monitoring systems by processing the severity level of log data from systems in production. Approach: To achieve this goal, we intend to deepen the knowledge about log severity levels and develop an automated approach to log severity level classification, demonstrating that reducing log severity level "noise" improves the monitoring of systems in production. Conclusion: We hope that the contributions from this work can improve the monitoring of software systems and contribute to knowledge that improves logging practices.
SE, Sep 2, 2021
Log severity levels matter: A multivocal mapping
Eduardo Mendes, Fabio Petrillo
The choice of log severity level can be challenging and cause problems in producing reliable logging data. However, there is a lack of specifications and practical guidelines to support this choice. In this study, we present a multivocal systematic mapping of log severity levels from peer-reviewed literature, logging libraries, and practitioners' views. We analyzed 19 severity levels, 27 studies, and 40 logging libraries. Our results show redundancy and semantic similarity between the levels, and a tendency for them to converge toward a total of six levels. Our contributions help improve the reliability of log entries: (i) a mapping of the literature about log severity levels, (ii) a mapping of the severity levels in logging libraries, (iii) a set of six synthesized definitions and four general purposes for severity levels. We recommend that developers use a standard nomenclature, and we suggest that logging library creators provide accurate and unambiguous definitions of log severity levels.
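To illustrate the recommendation of a standard nomenclature, the sketch below maps differently spelled level names onto one shared set. The abstract does not name the six levels, so the set used here (TRACE, DEBUG, INFO, WARN, ERROR, FATAL), the alias table, and the function name `normalize_level` are assumptions chosen for illustration, not results of the study.

```python
# Hypothetical canonical set: the abstract reports convergence toward six
# levels but does not name them, so these six are an assumption.
CANONICAL_LEVELS = ("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL")

# Illustrative aliases for names that appear under different spellings.
ALIASES = {
    "WARNING": "WARN",
    "ERR": "ERROR",
    "CRITICAL": "FATAL",
    "INFORMATION": "INFO",
}


def normalize_level(name: str) -> str:
    """Return the canonical spelling of a severity level name."""
    key = name.strip().upper()
    key = ALIASES.get(key, key)
    if key not in CANONICAL_LEVELS:
        # Refuse to guess: an unmapped name is reported, not silently kept.
        raise ValueError(f"unknown severity level: {name!r}")
    return key
```

Raising on an unmapped name, rather than passing it through, keeps ambiguous levels visible instead of letting them mix into the normalized data.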
SE, Jun 6, 2021
Towards Logging Noisiness Theory: quality aspects to characterize unwanted log entries
Eduardo Mendes, Fabio Petrillo
Context: Logging tasks track the system's functioning by keeping records of evidence that are analyzed by monitoring and observability activities. For these activities to be effective, it is necessary to consider the quality of the consumed information. Problem: However, the presence of noise (unwanted information) compromises the quality of log files. The noisiness of a log file can be affected, among other things, by: (i) wrong log severity level choices, (ii) the production of duplicate entries, (iii) the incompleteness of the information, (iv) the inappropriate format of the entries, (v) the amount of information generated. Objective: This work aims to broadly define the concept of noise in the context of logging, proposing the initial steps of Logging Noisiness, a theory on quality aspects to characterize unwanted log entries.
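Two of the aspects listed above, duplicate entries (ii) and the amount of information generated (v), lend themselves to simple measurement. The following is a minimal sketch, assuming each log entry is available as a plain string and that "duplicate" means an exact repeat of an earlier entry. The function name `noise_summary` and these definitions are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter


def noise_summary(entries):
    """Count total entries and exact duplicates in a list of log lines."""
    counts = Counter(entries)
    total = len(entries)
    # Every occurrence beyond the first of a given line counts as a duplicate.
    duplicates = total - len(counts)
    ratio = duplicates / total if total else 0.0
    return {"total": total, "duplicates": duplicates, "duplicate_ratio": ratio}


summary = noise_summary(["disk full", "user login", "disk full", "disk full"])
print(summary)
```

Exact matching is the simplest possible criterion; real entries usually carry timestamps or request identifiers, so a practical check would need to compare entries after removing such variable fields.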