CR | LG | May 31, 2025

PackHero: A Scalable Graph-based Approach for Efficient Packer Identification

arXiv:2506.00659v1 | 2 citations | h-index: 34 | Has Code | DIMVA
Originality: Incremental advance
AI Analysis

The work addresses a practical challenge for malware analysts by providing an efficient and adaptable method for packer identification. The advance is incremental, since it builds on existing graph matching and clustering techniques.

The paper tackles packer identification in malware analysis by introducing PackHero, a scalable graph-based approach. PackHero achieves a macro-average F1-score of 93.7% with 10 samples per packer, rising to 98.3% with 100 samples. It matches state-of-the-art signature-based tools overall and outperforms them on virtualization-based packers such as Themida/Winlicense, where it reaches 100% recall.
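For readers unfamiliar with the headline metric, the macro-average F1-score is the unweighted mean of the per-class F1 scores, so every packer counts equally regardless of how many samples it has. The following is a minimal sketch of that standard definition; the function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper or its repository.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (standard definition).

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp = defaultdict(int)  # true positives per class
    fp = defaultdict(int)  # false positives per class
    fn = defaultdict(int)  # false negatives per class

    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (made up): one UPX sample is misclassified as Themida.
y_true = ["upx", "upx", "themida", "themida"]
y_pred = ["upx", "themida", "themida", "themida"]
score = macro_f1(y_true, y_pred)
```

In this toy case the per-class F1 scores are 2/3 for `upx` and 4/5 for `themida`, so the macro average is their plain mean.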

Anti-analysis techniques, particularly packing, challenge malware analysts, making packer identification fundamental. Existing packer identifiers have significant limitations: signature-based methods lack flexibility and struggle against dynamic evasion, while Machine Learning approaches require extensive training data, limiting scalability and adaptability. Consequently, achieving accurate and adaptable packer identification remains an open problem. This paper presents PackHero, a scalable and efficient methodology for identifying packers using a novel static approach. PackHero employs a Graph Matching Network and clustering to match and group Call Graphs from programs packed with known packers. We evaluate our approach on a public dataset of malware and benign samples packed with various packers, demonstrating its effectiveness and scalability across varying sample sizes. PackHero achieves a macro-average F1-score of 93.7% with just 10 samples per packer, improving to 98.3% with 100 samples. Notably, PackHero requires fewer samples to achieve stable performance compared to other Machine Learning-based tools. Overall, PackHero matches the performance of State-of-the-art signature-based tools, outperforming them in handling Virtualization-based packers such as Themida/Winlicense, with a recall of 100%.
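The matching idea in the abstract can be illustrated with a small sketch: compare the call graph of an unknown sample against reference call graphs of programs packed with known packers, and report the packer of the closest match. The abstract does not spell out the matching procedure, so everything here is an assumption: call graphs are reduced to sets of (caller, callee) edges, the learned Graph Matching Network similarity is swapped for plain Jaccard overlap, the clustering step is omitted, and the names `jaccard`, `identify_packer`, the `threshold` parameter, and the toy graphs are hypothetical.

```python
def jaccard(edges_a, edges_b):
    """Jaccard overlap of two edge sets (stand-in for a learned similarity)."""
    if not edges_a and not edges_b:
        return 1.0
    return len(edges_a & edges_b) / len(edges_a | edges_b)


def identify_packer(query_edges, references, threshold=0.5):
    """Return (label, score) for the most similar reference call graph.

    references maps a packer label to a list of edge sets.
    The threshold for reporting "unknown" is an illustrative assumption.
    """
    best_label, best_score = None, 0.0
    for label, graphs in references.items():
        for graph in graphs:
            score = jaccard(query_edges, graph)
            if score > best_score:
                best_label, best_score = label, score
    if best_label is None or best_score < threshold:
        return "unknown", best_score
    return best_label, best_score


# Toy reference call graphs (made up, not real packer stubs).
references = {
    "upx": [{("entry", "unpack"), ("unpack", "oep")}],
    "themida": [{("entry", "vm_init"), ("vm_init", "vm_loop"), ("vm_loop", "vm_loop")}],
}

query = {("entry", "vm_init"), ("vm_init", "vm_loop")}
label, score = identify_packer(query, references)
```

Here the query shares two of three edges with the `themida` reference and none with `upx`, so the closest match is `themida`. A learned graph similarity would take the place of `jaccard` in a real system, which is where robustness to small structural changes in the call graph would come from.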

Code Implementations: 1 repo
