Log Optimization Simplification Method for Predicting Remaining Time
This work addresses efficiency in event log simplification for performance predictions in information systems, representing an incremental improvement over existing methods.
The paper tackles the problem of low-value and redundant information in event log data compromising prediction accuracy. It introduces a prediction point selection algorithm that avoids simplifying all similarly functioning points and optimizes deviation to prevent over-simplification. Experiments show that the simplified event log retains, and in some cases improves on, the predictive accuracy of the original.
Information systems generate large volumes of event log data during business operations, much of it low-value or redundant. Performance predictions made directly from these logs can therefore lose accuracy. Researchers have explored methods to simplify and compress these data while preserving their valuable components, and most existing approaches reduce the dimensionality of the data by eliminating redundant and irrelevant features. However, execution efficiency before and after event log simplification has received little attention. In this paper, we present a prediction point selection algorithm designed to avoid simplifying every point that functions similarly. We select sequence or self-loop structures to form a simplifiable segment, and we optimize the deviation between the actual simplifiable value and the original data prediction value to prevent over-simplification. Experiments indicate that the simplified event log retains its predictive performance and, in some cases, achieves higher predictive accuracy than the original event log.
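The idea of simplifying a self-loop segment while guarding against over-simplification can be illustrated with a minimal sketch. This is not the paper's algorithm: the trace layout (a list of `(activity, timestamp)` pairs), the function names `collapse_self_loops` and `simplify_if_safe`, the `predict` callable, and the fixed `tolerance` threshold are all assumptions made for illustration. The sketch merges consecutive repeats of one activity and keeps the simplified trace only when the predicted remaining time stays close to the prediction on the original trace.

```python
from itertools import groupby


def collapse_self_loops(trace):
    """Merge consecutive repeats of the same activity into a single event.

    `trace` is a list of (activity, timestamp) pairs in time order
    (an assumed layout). The first event of each run is kept.
    """
    return [next(run) for _, run in groupby(trace, key=lambda event: event[0])]


def simplify_if_safe(trace, predict, tolerance):
    """Return the simplified trace only if the prediction barely changes.

    `predict` is any callable mapping a trace to a predicted remaining
    time (hypothetical; stands in for a trained model). If the absolute
    deviation exceeds `tolerance`, the original trace is returned.
    """
    simplified = collapse_self_loops(trace)
    deviation = abs(predict(simplified) - predict(trace))
    return simplified if deviation <= tolerance else trace


if __name__ == "__main__":
    # Toy case: activity "B" repeats three times in a row (a self-loop).
    example = [("A", 0), ("B", 2), ("B", 3), ("B", 5), ("C", 9)]

    def by_span(t):
        # Toy predictor that depends only on the first and last timestamps.
        return t[-1][1] - t[0][1]

    print(simplify_if_safe(example, by_span, tolerance=1.0))
```

A predictor that is sensitive to the number of events would produce a larger deviation on the same trace, in which case the guard rejects the simplification and the original trace is kept.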