CR · LG · Nov 8, 2021

threaTrace: Detecting and Tracing Host-based Threats in Node Level Through Provenance Graph Learning

arXiv:2111.04333v1 · 170 citations
Originality Highly original
AI Analysis

This paper addresses the low detection performance of host-based intrusion detection against stealthy threats, offering a scalable, real-time approach to monitoring long-running hosts.

The paper proposes threaTrace, an anomaly-based detector that uses provenance graph learning to detect stealthy host-based threats, such as malware and APTs, at the level of individual system entities. On three public datasets, it outperforms three state-of-the-art host intrusion detection systems.

Host-based threats such as Program Attack, Malware Implantation, and Advanced Persistent Threats (APT) are commonly adopted by modern attackers. Recent studies propose leveraging the rich contextual information in data provenance to detect threats in a host. Data provenance is a directed acyclic graph constructed from system audit data. Nodes in a provenance graph represent system entities (e.g., processes and files), and edges represent system calls in the direction of information flow. However, previous studies, which extract features of the whole provenance graph, are not sensitive to the small number of threat-related entities and thus perform poorly when hunting stealthy threats. We present threaTrace, an anomaly-based detector that detects host-based threats at the system entity level without prior knowledge of attack patterns. We tailor GraphSAGE, an inductive graph neural network, to learn every benign entity's role in a provenance graph. threaTrace is a real-time system that scales to monitoring a long-running host and can detect host-based intrusions in their early phase. We evaluate threaTrace on three public datasets. The results show that threaTrace outperforms three state-of-the-art host intrusion detection systems.
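
To make the abstract's graph description concrete, the following is a minimal sketch, not the authors' implementation. It builds a small provenance graph from hypothetical audit records, then applies one unweighted, GraphSAGE-style mean aggregation over each node's incoming neighbors. The event tuples, the entity names, the choice of edge-type counts as node features, and all function names are assumptions made for illustration. A real GraphSAGE layer would also apply learned weight matrices and a nonlinearity, which are omitted here.

```python
from collections import defaultdict

# Hypothetical audit records: (source entity, destination entity, system call).
# Edge direction follows information flow, as described in the abstract.
EVENTS = [
    ("proc:bash", "proc:curl", "fork"),
    ("proc:curl", "file:/tmp/payload", "write"),
    ("file:/etc/passwd", "proc:curl", "read"),
]


def build_graph(events):
    """Return (sorted node list, outgoing adjacency, incoming adjacency)."""
    out_edges = defaultdict(list)
    in_edges = defaultdict(list)
    nodes = set()
    for src, dst, call in events:
        nodes.update((src, dst))
        out_edges[src].append((dst, call))
        in_edges[dst].append((src, call))
    return sorted(nodes), out_edges, in_edges


def edge_type_features(nodes, out_edges, in_edges, calls):
    """Assumed node features: outgoing counts per call type, then incoming counts."""
    feats = {}
    for n in nodes:
        out_counts = [sum(1 for _, c in out_edges[n] if c == call) for call in calls]
        in_counts = [sum(1 for _, c in in_edges[n] if c == call) for call in calls]
        feats[n] = out_counts + in_counts
    return feats


def mean_aggregate(nodes, in_edges, feats):
    """Concatenate each node's features with the mean of its in-neighbors' features.

    This is only the aggregation step of a GraphSAGE-style layer; there are
    no learned weights and no activation function.
    """
    dim = len(next(iter(feats.values())))
    result = {}
    for n in nodes:
        neigh = [feats[src] for src, _ in in_edges[n]]
        if neigh:
            mean = [sum(col) / len(neigh) for col in zip(*neigh)]
        else:
            mean = [0.0] * dim
        result[n] = feats[n] + mean
    return result


if __name__ == "__main__":
    nodes, out_e, in_e = build_graph(EVENTS)
    feats = edge_type_features(nodes, out_e, in_e, ["fork", "read", "write"])
    for node, vec in mean_aggregate(nodes, in_e, feats).items():
        print(node, vec)
```

In a node-level detector of the kind the abstract describes, a model trained only on benign graphs would consume per-node vectors like these, and entities whose learned role disagrees with their observed behavior would be flagged. That classification step is not shown because the abstract does not specify it.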

Code Implementations: 1 repo
