CR, AI · May 25, 2019

Transferable Cost-Aware Security Policy Implementation for Malware Detection Using Deep Reinforcement Learning

arXiv:1905.10517v2 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses the cost and efficiency challenges faced by organizations that use malware detection ensembles, and it offers a transferable and adaptable solution. It is an incremental advance, since it builds on existing reinforcement learning and ensemble methods.

The study tackled the computational cost of ensemble malware detection by proposing SPIREL, a reinforcement learning method that dynamically assigns detectors and thresholds per file, achieving either superior accuracy with modest efficiency gains or an ~80% reduction in runtime with only a 0.5% drop in accuracy and F1-score.

Malware detection is an ever-present challenge for all organizational gatekeepers, who must maintain high detection rates while minimizing interruptions to the organization's workflow. To improve detection rates, organizations often deploy an ensemble of detectors. While effective, this approach is computationally expensive, since every file, even a clear-cut case, needs to be analyzed by all detectors. Moreover, with an ever-increasing number of files to process, the use of ensembles may incur unacceptable processing times and costs (e.g., cloud resources). In this study, we propose SPIREL, a reinforcement learning-based method for cost-effective malware detection. Our method enables organizations to directly associate costs with correct/incorrect classification, computing resources and run-time, and then dynamically establishes a security policy. This security policy is then implemented, and for each inspected file, a different set of detectors is assigned and a different detection threshold is set. Our evaluation on two malware domains, Portable Executable (PE) and Android Application Package (APK) files, shows that SPIREL is both accurate and extremely resource-efficient: the proposed method either outperforms the best-performing baselines while achieving a modest improvement in efficiency, or reduces the required running time by ~80% while decreasing the accuracy and F1-score by only 0.5%. We also show that our approach is both highly transferable across different datasets and adaptable to changes in individual detector performance.
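
The trade-off described in the abstract, paying for each detector call and stopping once a file is a clear-cut case, can be illustrated with a small sketch. This is not SPIREL itself: the abstract does not give the reward function or the learned policy, so the hand-written early-stopping rule below stands in for the reinforcement learning agent, and the fixed `low`/`high` thresholds stand in for the per-file thresholds the paper sets dynamically. The names `Detector`, `classify_with_early_stop`, and `reward`, and all cost values, are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Detector:
    name: str
    runtime: float                 # assumed cost of one call, in seconds
    score: Callable[[str], float]  # maliciousness score in [0, 1]


def classify_with_early_stop(
    file_id: str,
    detectors: List[Detector],
    low: float = 0.2,
    high: float = 0.8,
) -> Tuple[bool, float, List[str]]:
    """Query detectors cheapest-first; stop once the mean score is decisive.

    Hypothetical stand-in for a learned policy. Returns
    (is_malicious, total_runtime, names_of_detectors_used).
    """
    scores: List[float] = []
    used: List[str] = []
    total_runtime = 0.0
    mean = 0.5  # neutral prior if no detector is available
    for det in sorted(detectors, key=lambda d: d.runtime):
        scores.append(det.score(file_id))
        used.append(det.name)
        total_runtime += det.runtime
        mean = sum(scores) / len(scores)
        if mean <= low or mean >= high:
            break  # clear-cut case: skip the remaining, costlier detectors
    return mean >= 0.5, total_runtime, used


def reward(
    correct: bool,
    runtime: float,
    gain_correct: float = 1.0,     # assumed value of a correct verdict
    penalty_wrong: float = 10.0,   # assumed cost of a wrong verdict
    cost_per_second: float = 0.1,  # assumed price of compute time
) -> float:
    """Cost-aware reward: classification outcome minus runtime cost."""
    base = gain_correct if correct else -penalty_wrong
    return base - cost_per_second * runtime


# Toy detectors with made-up scores, for demonstration only.
toy_detectors = [
    Detector("fast_static", 1.0, lambda f: 0.95 if f == "malware.exe" else 0.5),
    Detector("slow_dynamic", 30.0, lambda f: 0.9 if f == "malware.exe" else 0.1),
]
verdict, seconds, used = classify_with_early_stop("malware.exe", toy_detectors)
print(verdict, seconds, used, reward(correct=True, runtime=seconds))
```

In this toy setup the obviously malicious file is resolved by the cheap detector alone, while an ambiguous file triggers the expensive one as well. Raising `cost_per_second` relative to `penalty_wrong` makes early stopping more attractive, which is the kind of organizational preference the abstract says the method lets users encode.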
