SCADS: A Scalable Approach Using Spark in Cloud for Host-based Intrusion Detection System with System Calls
This work addresses scalability issues in intrusion detection for data center administrators, but it is incremental as it applies existing big data tools to a known problem.
The authors address the challenge of scaling host-based intrusion detection to the large volumes of system call traces produced in data centers. They propose SCADS, a solution built on Apache Spark in the cloud, and report that it improves detection efficiency.
Following the current big data trend, the volume of real-time system call traces generated by Linux applications in a contemporary data center can grow very large. Traditional host-based intrusion detection systems, deployed on each individual host, lack scalability, which makes it challenging for them to collect, maintain, and process these accumulated large-scale traces. Building data mining models on a single physical host with fixed computing capability and limited storage capacity is inflexible. To address this issue, we propose SCADS, a solution using Apache Spark in the Google cloud environment. A set of Spark algorithms is developed to achieve computational scalability. The experimental results demonstrate that the efficiency of intrusion detection can be enhanced, which indicates that the proposed method is applicable to the design of next-generation host-based intrusion detection systems based on system calls.
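The abstract does not describe the Spark algorithms themselves, so the following is only a minimal sketch of the map/reduce pattern that makes this kind of trace processing scalable. It is not the SCADS method. It assumes a classic sliding-window (n-gram) model of system call sequences, which is our choice for illustration, and it uses plain Python in place of Spark so that it runs without a cluster. All function names and the sample traces are hypothetical.

```python
from collections import Counter
from functools import reduce


def ngrams(trace, n=3):
    """Map step: emit every length-n window of system calls in one trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def count_partition(traces, n=3):
    """Count n-grams inside one partition; each partition is processed independently."""
    counts = Counter()
    for trace in traces:
        counts.update(ngrams(trace, n))
    return counts


def merge_counts(a, b):
    """Reduce step: merging partial counts is associative and commutative,
    so partitions can be combined in any order."""
    return a + b


def build_profile(partitions, n=3):
    """Build a 'normal behaviour' profile from traces split across partitions."""
    partials = (count_partition(p, n) for p in partitions)
    return reduce(merge_counts, partials, Counter())


def unseen_ratio(trace, profile, n=3):
    """Fraction of a trace's n-grams that never occur in the profile."""
    grams = ngrams(trace, n)
    if not grams:
        return 0.0
    return sum(1 for g in grams if g not in profile) / len(grams)


# Hypothetical normal traces, split into two partitions as a cluster might hold them.
partitions = [
    [["open", "read", "write", "close"]],
    [["open", "read", "read", "close"]],
]
profile = build_profile(partitions)
```

Because the per-partition counts are merged with an associative operation, the profile does not depend on how the traces are distributed across hosts. In a Spark setting the same structure would typically map onto a flat-map over traces followed by a reduce by key, although whether SCADS does this is not stated in the abstract.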