CR · DC · Nov 23, 2016

A Novel Control-flow based Intrusion Detection Technique for Big Data Systems

arXiv:1611.07649v1 · 3 citations
Originality: Incremental advance
AI Analysis

This paper addresses security vulnerabilities in big data infrastructure, but the advance is incremental because it builds on existing intrusion detection concepts.

The paper tackles the immature security features of big data systems by proposing an intrusion detection technique that identifies program-level anomalies using control-flow graphs and signature matching. In real-time tests on the Hadoop MapReduce examples, the technique added only 0.8% overhead.

Security and distributed infrastructure are two of the most common requirements for big data software, but the security features of big data platforms are still immature. It is critical to identify, modify, test and execute some of the existing security mechanisms before using them in the big data world. In this paper, we propose a novel intrusion detection technique that understands and works according to the needs of big data systems. Our proposed technique identifies program-level anomalies using two methods: a profiling method that models application behavior by creating process signatures from control-flow graphs, and a matching method that checks for coherence among the replica nodes of a big data system by matching the process signatures. The profiling method creates a process signature by reducing the control-flow graph of a process to a set of minimum spanning trees and then hashing that set. The matching method first checks for similarity in process behavior by matching the received process signature with the local signature and then shares the result with all replica datanodes for consensus. Experimental results show only 0.8% overhead due to the proposed technique when tested on the Hadoop MapReduce examples in real time.
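The profiling and matching steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the control-flow graph is given as undirected edges with unit weights, uses Kruskal's algorithm to build the spanning trees, hashes them with SHA-256, and treats consensus as a simple majority vote. All function names and the example graph are hypothetical.

```python
import hashlib


def minimum_spanning_forest(num_nodes, edges):
    """Kruskal's algorithm over undirected weighted edges (u, v, weight).

    Returns the chosen edges in sorted order, one tree per connected component.
    Treating the CFG as undirected is an assumption made for this sketch.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Normalise each edge so (u, v) and (v, u) count as the same edge.
    normalised = {(min(u, v), max(u, v), w) for u, v, w in edges}
    chosen = []
    # Sorting by (weight, u, v) keeps tie-breaking identical on every node.
    for u, v, w in sorted(normalised, key=lambda e: (e[2], e[0], e[1])):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            chosen.append((u, v, w))
    return sorted(chosen)


def process_signature(num_nodes, edges):
    """Profiling step: reduce the CFG to spanning trees, then hash them."""
    forest = minimum_spanning_forest(num_nodes, edges)
    encoded = ";".join(f"{u}-{v}-{w}" for u, v, w in forest)
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def signatures_match(local_signature, received_signature):
    """Matching step: compare a received signature with the local one."""
    return local_signature == received_signature


def replicas_agree(votes):
    """Assumed consensus rule: a strict majority of replicas report a match."""
    return sum(votes) * 2 > len(votes)


# Hypothetical CFG of a small process: basic blocks 0..4, unit edge weights.
cfg = [(0, 1, 1), (1, 2, 1), (1, 3, 1), (2, 4, 1), (3, 4, 1)]
# The same process with an injected basic block 5.
tampered = cfg + [(4, 5, 1)]

local = process_signature(5, cfg)
replica_ok = process_signature(5, list(reversed(cfg)))
replica_bad = process_signature(6, tampered)

votes = [signatures_match(local, replica_ok),
         signatures_match(local, replica_bad)]
print("replicas agree:", replicas_agree(votes))
```

Because only a fixed-length hash is exchanged between datanodes, the comparison cost does not grow with the size of the control-flow graph. The deterministic edge ordering matters: without it, two honest replicas could pick different but equally valid spanning trees and produce different signatures.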

