CR · LG · Jan 30

PIDSMaker: Building and Evaluating Provenance-based Intrusion Detection Systems

arXiv:2601.22983v2 · 2 citations · h-index: 5 · Has Code
Originality: Synthesis-oriented
AI Analysis

PIDSMaker addresses reproducibility and fair-comparison problems for cybersecurity researchers, though the contribution is incremental because it builds on existing systems.

The paper tackles the difficulty of evaluating and comparing provenance-based intrusion detection systems (PIDSs), which stems from inconsistent evaluation protocols. It presents PIDSMaker, an open-source framework that consolidates eight state-of-the-art systems with standardized preprocessing and labels, enabling consistent experiments and apples-to-apples comparisons.

Recent provenance-based intrusion detection systems (PIDSs) have demonstrated strong potential for detecting advanced persistent threats (APTs) by applying machine learning to system provenance graphs. However, evaluating and comparing PIDSs remains difficult: prior work uses inconsistent preprocessing pipelines, non-standard dataset splits, and incompatible ground-truth labeling and metrics. These discrepancies undermine reproducibility, impede fair comparison, and impose substantial re-implementation overhead on researchers. We present PIDSMaker, an open-source framework for developing and evaluating PIDSs under consistent protocols. PIDSMaker consolidates eight state-of-the-art systems into a modular, extensible architecture with standardized preprocessing and ground-truth labels, enabling consistent experiments and apples-to-apples comparisons. A YAML-based configuration interface supports rapid prototyping by composing components across systems without code changes. PIDSMaker also includes utilities for ablation studies, hyperparameter tuning, multi-run instability measurement, and visualization, addressing methodological gaps identified in prior work. We demonstrate PIDSMaker through concrete use cases and release it with preprocessed datasets and labels to support shared evaluation for the PIDS community.
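
The multi-run instability measurement mentioned in the abstract can be illustrated with a minimal sketch. This is not PIDSMaker's API: `run_detector` and `instability` are hypothetical names, and the detector is replaced by a seeded random stand-in so the example runs on its own. The idea is to repeat the same experiment under different seeds and report the spread of the resulting metric rather than a single number.

```python
import random
import statistics


def run_detector(seed: int) -> float:
    """Hypothetical stand-in for one train/evaluate run of a detector.

    A real run would train a model with this seed and return a detection
    metric; here we only simulate a score that varies with the seed.
    """
    rng = random.Random(seed)
    return 0.80 + rng.uniform(-0.05, 0.05)


def instability(scores: list[float]) -> dict[str, float]:
    """Summarize run-to-run variation of a metric across repeated runs."""
    # Sample standard deviation is undefined for a single run, so report 0.0.
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return {
        "mean": statistics.mean(scores),
        "std": std,
        "min": min(scores),
        "max": max(scores),
        "range": max(scores) - min(scores),
    }


# Repeat the same (simulated) experiment under five different seeds.
scores = [run_detector(seed) for seed in range(5)]
summary = instability(scores)
print(summary)
```

Reporting the standard deviation and range alongside the mean makes it visible when two systems differ by less than their own seed-to-seed variation.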
