NEMay 16, 2019

Building an Effective Intrusion Detection System using Unsupervised Feature Selection in Multi-objective Optimization Framework

arXiv:1905.06562v113 citations
Originality Synthesis-oriented
AI Analysis

This work addresses network security by improving intrusion detection, but it is incremental as it applies existing methods like NSGA-II and standard classifiers to feature selection.

The paper tackles the problem of building an intrusion detection system by proposing an unsupervised feature selection technique using a multi-objective optimization framework, achieving high accuracy rates such as 99.78% on the KDD-99 dataset.

Intrusion Detection Systems (IDS) are developed to protect the network by detecting the attack. The current paper proposes an unsupervised feature selection technique for analyzing the network data. The search capability of the non-dominated sorting genetic algorithm (NSGA-II) has been employed for optimizing three different objective functions utilizing different information theoretic measures including mutual information, standard deviation, and information gain to identify mutually exclusive and a high variant subset of features. Finally, the Pareto optimal front of the different optimal feature subsets are obtained and these feature subsets are utilized for developing classification systems using different popular machine learning models like support vector machines, decision trees and k-nearest neighbour (k=5) classifier etc. We have evaluated the results of the algorithm on KDD-99, NSL-KDD and Kyoto 2006+ datasets. The experimental results on KDD-99 dataset show that decision tree provides better results than other available classifiers. The proposed system obtains the best results of 99.78% accuracy, 99.27% detection rate and false alarm rate of 0.2%, which are better than all the previous results for KDD dataset. We achieved an accuracy of 99.83% for 20% testing data of NSL-KDD dataset and 99.65% accuracy for 10-fold cross-validation on Kyoto dataset. The most attractive characteristic of the proposed scheme is that during the selection of appropriate feature subset, no labeled information is utilized and different feature quality measures are optimized simultaneously using the multi-objective optimization framework.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes