Behavior Query Discovery in System-Generated Temporal Graphs
This work addresses the problem system administrators face in manually formulating queries to detect abnormal activities and risks in computer systems; it represents a domain-specific, incremental improvement.
The paper tackles the challenge of querying complex system monitoring logs by introducing TGMiner, a method that mines discriminative temporal graph patterns from logs. TGMiner runs 6-32 times faster than baseline methods, and the patterns it discovers achieve 97% precision and 91% recall.
Computer system monitoring generates huge amounts of logs that record the interactions of system entities. Querying such data to better understand system behaviors and to identify potential system risks and malicious behaviors is a challenging task for system administrators because the data are dynamic and heterogeneous. System monitoring data are essentially heterogeneous temporal graphs in which nodes are system entities and edges are their interactions over time. Given the complexity of such graphs, it is time-consuming for system administrators to manually formulate useful queries for examining abnormal activities, attacks, and vulnerabilities in computer systems. In this work, we investigate how to query temporal graphs and treat query formulation as a discriminative temporal graph pattern mining problem. We introduce TGMiner to mine discriminative patterns from system logs; these patterns can serve as templates for building more complex queries. TGMiner leverages temporal information in graphs to prune graph patterns that share similar growth trends without compromising pattern quality. Experimental results on real system data show that TGMiner is 6-32 times faster than baseline methods. The discovered patterns were verified by system experts and achieved high precision (97%) and recall (91%).
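To make the problem setting concrete, the following is a minimal Python sketch of what "discriminative temporal graph pattern" could mean on toy data. It is not TGMiner itself and does not include its pruning: the class `TemporalGraph`, the functions `matches` and `discriminative_score`, the score definition (match frequency in positive graphs minus match frequency in negative graphs), and the sample entities are all assumptions introduced for illustration, since the abstract does not specify them.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, str, str]      # (timestamp, source entity, destination entity)
PatternEdge = Tuple[str, str]    # (source variable, destination variable)


class TemporalGraph:
    """Toy heterogeneous temporal graph: typed entities plus timestamped edges."""

    def __init__(self, labels: Dict[str, str], edges: List[Edge]) -> None:
        self.labels = labels          # entity id -> entity type (hypothetical types)
        self.edges = sorted(edges)    # tuples sort by timestamp first


def _try_bind(graph: TemporalGraph, var_types: Dict[str, str],
              binding: Dict[str, str], var: str, entity: str) -> bool:
    """Bind a pattern variable to an entity if types agree and the map stays injective."""
    if graph.labels[entity] != var_types[var]:
        return False
    if var in binding:
        return binding[var] == entity
    if entity in binding.values():
        return False
    binding[var] = entity
    return True


def matches(graph: TemporalGraph, var_types: Dict[str, str],
            pattern: List[PatternEdge]) -> bool:
    """True if the pattern edges map, in order, onto graph edges in timestamp order."""

    def search(p_idx: int, g_start: int, binding: Dict[str, str]) -> bool:
        if p_idx == len(pattern):
            return True
        p_src, p_dst = pattern[p_idx]
        for g_idx in range(g_start, len(graph.edges)):
            _, src, dst = graph.edges[g_idx]
            trial = dict(binding)  # copy so a failed attempt leaves no trace
            if (_try_bind(graph, var_types, trial, p_src, src)
                    and _try_bind(graph, var_types, trial, p_dst, dst)
                    and search(p_idx + 1, g_idx + 1, trial)):
                return True
        return False

    return search(0, 0, {})


def discriminative_score(var_types: Dict[str, str], pattern: List[PatternEdge],
                         positives: List[TemporalGraph],
                         negatives: List[TemporalGraph]) -> float:
    """Assumed score: match rate in positives minus match rate in negatives.

    Both lists are assumed to be non-empty.
    """
    pos = sum(matches(g, var_types, pattern) for g in positives) / len(positives)
    neg = sum(matches(g, var_types, pattern) for g in negatives) / len(negatives)
    return pos - neg


# Hypothetical toy data: the same two interactions in opposite temporal order.
LABELS = {"p1": "process", "p2": "process", "f1": "file"}
positive_run = TemporalGraph(LABELS, [(1, "p1", "p2"), (2, "p2", "f1")])
negative_run = TemporalGraph(LABELS, [(1, "p2", "f1"), (2, "p1", "p2")])

# Pattern: a process touches a second process, which then touches a file.
VAR_TYPES = {"A": "process", "B": "process", "C": "file"}
PATTERN = [("A", "B"), ("B", "C")]

score = discriminative_score(VAR_TYPES, PATTERN, [positive_run], [negative_run])
```

In this sketch the two toy graphs contain identical edges and differ only in edge order, so the pattern separates them purely through temporal information. A non-temporal subgraph pattern could not tell them apart, which is the intuition behind treating system logs as temporal graphs.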