CRAILGSep 18, 2023

Efficient Concept Drift Handling for Batch Android Malware Detection Models

arXiv:2309.09807v117 citationsh-index: 24
Originality Incremental advance
AI Analysis

This addresses the challenge of model obsolescence for Android malware detection systems, offering incremental improvements in efficiency.

The paper tackles the problem of concept drift in static batch machine learning models for Android malware detection, showing that retraining techniques with concept drift detection and sample selection maintain detector performance over time, achieving efficient strategies with reduced data usage.

The rapidly evolving nature of Android apps poses a significant challenge to static batch machine learning algorithms employed in malware detection systems, as they quickly become obsolete. Despite this challenge, the existing literature pays limited attention to addressing this issue, with many advanced Android malware detection approaches, such as Drebin, DroidDet and MaMaDroid, relying on static models. In this work, we show how retraining techniques are able to maintain detector capabilities over time. Particularly, we analyze the effect of two aspects in the efficiency and performance of the detectors: 1) the frequency with which the models are retrained, and 2) the data used for retraining. In the first experiment, we compare periodic retraining with a more advanced concept drift detection method that triggers retraining only when necessary. In the second experiment, we analyze sampling methods to reduce the amount of data used to retrain models. Specifically, we compare fixed sized windows of recent data and state-of-the-art active learning methods that select those apps that help keep the training dataset small but diverse. Our experiments show that concept drift detection and sample selection mechanisms result in very efficient retraining strategies which can be successfully used to maintain the performance of the static Android malware state-of-the-art detectors in changing environments.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes