Intrusion Detection: Machine Learning Baseline Calculations for Image Classification
This is an incremental study in cybersecurity, showing that image-based methods for malware detection are not yet effective with current techniques.
The paper tackles intrusion detection by converting network attack data into images and applying machine learning, but most models fail to exceed 80% accuracy and have low F1 scores, indicating limited success.
Cybersecurity can be enhanced by recasting network attack data into an image format and then applying supervised computer vision and other machine learning techniques to detect malicious specimens. Exploratory data analysis reveals little correlation and few distinguishing characteristics among the ten classes of malware used in this study. A general model comparison identifies Light Gradient Boosting Machine, Random Forest Classifier, and Extra Trees Classifier as the most promising candidates. Convolutional networks fail to deliver their usual strong classification performance and are surpassed by a simple, fully connected architecture. Most tests fail to break 80% categorical accuracy and yield low F1 scores, indicating that more sophisticated approaches (e.g., bootstrapping, random sampling, and feature selection) may be required to maximize performance.
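As a rough illustration of the two ideas the abstract relies on, the sketch below recasts a flat record of numeric features as a small grayscale image and computes categorical accuracy alongside macro-averaged F1. It is a minimal sketch under stated assumptions, not the study's pipeline: the min-max scaling to 0..255, the zero-padding to a square grid, the macro averaging, and the helper names `to_grayscale_image`, `accuracy`, and `macro_f1` are all hypothetical choices made for this example, and the toy labels are invented.

```python
import math


def to_grayscale_image(features):
    """Min-max scale a flat feature vector to 0..255 and pad it into a square grid.

    Assumption: the scaling and zero-padding are illustrative choices only.
    """
    if not features:
        raise ValueError("features must be non-empty")
    lo, hi = min(features), max(features)
    span = hi - lo
    # A constant vector has no contrast, so it maps to all zeros.
    pixels = [0 if span == 0 else round(255 * (x - lo) / span) for x in features]
    # Smallest side length such that side * side >= number of pixels.
    side = math.isqrt(len(pixels) - 1) + 1
    pixels += [0] * (side * side - len(pixels))  # zero-pad the tail
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy record with six made-up feature values, reshaped to a 3x3 image.
image = to_grayscale_image([0, 10, 20, 30, 40, 50])

# Toy labels: a classifier that always predicts class 0 on a skewed sample.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
print(image)
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

In the toy labels, a classifier that always predicts the majority class scores well on accuracy but poorly on macro F1, which is one reason the two metrics are worth reading together. This says nothing about why the F1 scores in the study were low.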