LG · AI · CR · Sep 18, 2025

BEACON: Behavioral Malware Classification with Large Language Model Embeddings and Deep Learning

arXiv:2509.14519v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses the problem of evasive, increasingly complex malware for cybersecurity practitioners, offering an incremental improvement in classification accuracy.

The paper tackles malware detection by proposing BEACON, a deep learning framework that classifies samples using large language model embeddings of sandbox behavioral reports. It consistently outperforms existing methods on the Avast-CTU Public CAPE Dataset.

Malware is becoming increasingly complex and widespread, making it essential to develop more effective and timely detection methods. Traditional static analysis often fails to defend against modern threats that employ code obfuscation, polymorphism, and other evasion techniques. In contrast, behavioral malware detection, which monitors runtime activities, provides a more reliable and context-aware solution. In this work, we propose BEACON, a novel deep learning framework that leverages large language models (LLMs) to generate dense, contextual embeddings from raw sandbox-generated behavior reports. These embeddings capture semantic and structural patterns of each sample and are processed by a one-dimensional convolutional neural network (1D CNN) for multi-class malware classification. Evaluated on the Avast-CTU Public CAPE Dataset, our framework consistently outperforms existing methods, highlighting the effectiveness of LLM-based behavioral embeddings and the overall design of BEACON for robust malware classification.
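The pipeline described in the abstract (report embedding, then a 1D CNN, then a multi-class prediction) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. The embedding below is a random stand-in for an LLM embedding, all layer sizes and the class count are placeholders not taken from the paper, and the weights are untrained. The helper names (`conv1d`, `classify`) are hypothetical.

```python
import math
import random


def conv1d(signal, kernel, bias=0.0):
    """Valid 1D cross-correlation, the operation DL libraries call convolution."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embedding, kernels, class_weights):
    """Conv filters -> ReLU -> global max pool -> linear layer -> softmax."""
    pooled = [max(relu(conv1d(embedding, k))) for k in kernels]
    logits = [sum(w * p for w, p in zip(row, pooled)) for row in class_weights]
    return softmax(logits)


rng = random.Random(0)
# Placeholder sizes chosen for illustration only (assumptions, not from the paper).
EMBED_DIM, NUM_FILTERS, KERNEL_SIZE, NUM_CLASSES = 32, 4, 3, 10

# Random stand-in for the LLM embedding of one sandbox behavior report.
embedding = [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
# Untrained, randomly initialised parameters; a real model would learn these.
kernels = [[rng.uniform(-1, 1) for _ in range(KERNEL_SIZE)] for _ in range(NUM_FILTERS)]
class_weights = [[rng.uniform(-1, 1) for _ in range(NUM_FILTERS)] for _ in range(NUM_CLASSES)]

probs = classify(embedding, kernels, class_weights)
predicted_family = max(range(NUM_CLASSES), key=probs.__getitem__)
```

Here `probs` is a probability distribution over the placeholder malware families and `predicted_family` is the index of the most likely one. In practice the embedding would come from an LLM applied to the behavior report, and the filters and classifier weights would be trained on labelled samples.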
