Efficient Network Traffic Feature Sets for IoT Intrusion Detection
This work addresses computational efficiency in IoT cybersecurity, but its contribution is incremental: it applies existing feature selection methods to new data.
The paper tackles the problem of improving computational efficiency in IoT intrusion detection by evaluating feature selection methods on multiple IoT network datasets. ML models trained on the reduced feature sets are more computationally efficient, with little to no difference in generalization.
The use of Machine Learning (ML) models in cybersecurity solutions requires high-quality data that is stripped of redundant, missing, and noisy information. By selecting the most relevant features, data integrity and model efficiency can be significantly improved. This work evaluates the feature sets provided by a combination of different feature selection methods, namely Information Gain, Chi-Squared Test, Recursive Feature Elimination, Mean Absolute Deviation, and Dispersion Ratio, on multiple IoT network datasets. The influence of the smaller feature sets on both the classification performance and the training time of ML models is compared, with the aim of increasing the computational efficiency of IoT intrusion detection. Overall, the most impactful features of each dataset were identified, and the ML models obtained higher computational efficiency while preserving good generalization, showing little to no difference between the sets.
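To make the idea of combining feature selection methods concrete, the following minimal sketch scores features with two of the unsupervised filters named above, Mean Absolute Deviation and Dispersion Ratio (arithmetic mean divided by geometric mean), and keeps the features that both methods rank highly. The combination rule (intersection of the top-k sets), the helper names, and the toy feature values are assumptions made for illustration; the text above does not state how the paper combines the methods, and this is not the authors' implementation.

```python
import math
from statistics import fmean


def mean_absolute_deviation(values):
    """Average absolute distance of the values from their mean."""
    mu = fmean(values)
    return fmean(abs(v - mu) for v in values)


def dispersion_ratio(values):
    """Arithmetic mean divided by geometric mean (values must be positive)."""
    arithmetic = fmean(values)
    geometric = math.exp(fmean(math.log(v) for v in values))
    return arithmetic / geometric


def top_k(scores, k):
    """Names of the k highest-scoring features."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def select_features(columns, k):
    """Keep features ranked in the top k by both scores.

    The intersection rule is a hypothetical choice for this sketch.
    """
    mad = {name: mean_absolute_deviation(vals) for name, vals in columns.items()}
    disp = {name: dispersion_ratio(vals) for name, vals in columns.items()}
    return top_k(mad, k) & top_k(disp, k)


# Toy, made-up traffic features (all strictly positive).
toy_columns = {
    "pkt_count": [1.0, 50.0, 2.0, 80.0],
    "duration": [1.0, 1.1, 0.9, 1.0],
    "ttl": [64.0, 64.0, 64.0, 64.0],
    "bytes": [10.0, 900.0, 20.0, 1500.0],
}

selected = select_features(toy_columns, k=2)
print(sorted(selected))
```

On this toy data, near-constant features such as `ttl` and `duration` score low under both filters and are dropped. Supervised methods such as Information Gain, Chi-Squared Test, and Recursive Feature Elimination would additionally need class labels, and the effect of the reduced set on classification performance and training time would still have to be measured on a trained model.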