CR | Jul 24, 2019

Anomaly-based Intrusion Detection in Industrial Data with SVM and Random Forests

Simon D. Duque Anton, Sapna Sinha, Hans Dieter Schotten

arXiv:1907.10374v1 | 16.396 citations | h-index: 44

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for efficient intrusion detection in industrial control systems, which are vulnerable due to legacy requirements and physical-world impacts, but it is incremental as it applies existing methods to new data.

The paper tackled intrusion detection in industrial networks by applying SVM and Random Forest algorithms to analyze network data from gas pipeline and batch processing traffic, with Random Forest slightly outperforming SVM in detecting attacks.

Attacks on industrial enterprises are increasing in number as well as in effect. Since the introduction of industrial control systems in the 1970s, industrial networks have been the target of malicious actors. More recently, the political and warfare aspects of attacks on industrial and critical infrastructure have become more relevant. In contrast to classic home and office IT systems, industrial IT systems, so-called OT systems, have an effect on the physical world. Furthermore, industrial devices have long operation times, sometimes several decades. Updates and fixes are tedious and often not possible. The threats to industry, combined with the legacy requirements of industrial environments, create the need for efficient intrusion detection that can be integrated into existing systems. In this work, network data capturing industrial operation is analysed with machine learning- and time series-based anomaly detection algorithms in order to discover the attacks introduced into the data. Two different data sets are used, one containing Modbus-based gas pipeline control traffic and one containing OPC UA-based batch processing traffic. In order to detect attacks, two machine learning-based algorithms are used, namely SVM and Random Forest. Both perform well, with Random Forest slightly outperforming SVM. Furthermore, extracting and selecting features as well as handling missing data are addressed in this work.
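To make the comparison described in the abstract more concrete, the following is a minimal sketch of training an SVM and a Random Forest on labelled traffic features with missing values. It is not the authors' pipeline: the data is synthetic (generated with scikit-learn's `make_classification`), and the median imputation, feature scaling, RBF kernel, 5% missing-value rate, and F1 metric are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for per-packet features (assumption: NOT the paper's
# Modbus or OPC UA data). Label 1 plays the role of "attack", 0 of "normal".
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.8, 0.2],
    random_state=0,
)

# Blank out roughly 5% of the values to mimic missing fields in the traffic.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Both models share the same (assumed) missing-data strategy: median imputation.
# The SVM additionally gets standardised features, since it is scale-sensitive.
models = {
    "SVM": make_pipeline(
        SimpleImputer(strategy="median"),
        StandardScaler(),
        SVC(kernel="rbf"),
    ),
    "Random Forest": make_pipeline(
        SimpleImputer(strategy="median"),
        RandomForestClassifier(n_estimators=100, random_state=0),
    ),
}

# Fit each model and record its F1 score on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test))
    print(f"{name}: F1 = {scores[name]:.3f}")
```

The scores this prints describe only the synthetic data and say nothing about the results reported in the paper. Putting imputation inside each pipeline means the median is learned from the training split alone, which avoids leaking test information into the model.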
