CR, AI | Jul 27, 2021

PDF-Malware: An Overview on Threats, Detection and Evasion Attacks

arXiv:2107.12873v1 | 1 citation
Originality: Synthesis-oriented
AI Analysis

This is an incremental overview paper addressing PDF malware detection for information security practitioners and researchers.

The paper provides an overview of the PDF-malware detection problem, highlighting the increasing threats and the evolution of detection techniques, including machine learning methods and evasion attacks.

In recent years, the Portable Document Format, commonly known as PDF, has become a widely adopted standard for document exchange and dissemination. This trend owes much to its flexibility and portability across platforms. The widespread use of PDF has instilled a false impression of inherent safety among benign users. However, these same characteristics have motivated hackers to exploit various types of vulnerabilities and overcome security safeguards, making the PDF format one of the most efficient attack vectors for malicious code. Efficiently detecting malicious PDF files is therefore crucial for information security. Several analysis techniques, whether static or dynamic, have been proposed in the literature to extract the main features that discriminate malicious files from benign ones. Since classical analysis techniques may be limited in the case of zero-day attacks, machine-learning-based techniques have recently emerged as an automatic PDF-malware detection method that can generalize from a set of training samples. These techniques themselves face the challenge of evasion attacks, in which a malicious PDF is transformed to look benign. In this work, we give an overview of the PDF-malware detection problem and a perspective on the new challenges and emerging solutions.
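To make the idea of static feature extraction concrete, here is a minimal sketch that counts a few PDF name objects in the raw bytes of a file. The keyword list and the function name `keyword_counts` are assumptions chosen for illustration and are not taken from the paper; names such as `/JavaScript` and `/OpenAction` are typical of what keyword-based static scanners look for, and the resulting counts could serve as one possible feature vector for a classifier.

```python
import re

# Hypothetical keyword list for illustration only; not taken from the paper.
SUSPICIOUS_NAMES = (
    b"/JavaScript",
    b"/JS",
    b"/OpenAction",
    b"/AA",
    b"/Launch",
    b"/EmbeddedFile",
)


def keyword_counts(pdf_bytes: bytes) -> dict:
    """Count whole-token occurrences of each name in raw PDF bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The lookahead rejects longer names that merely start with the keyword.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, pdf_bytes))
    return counts


# A tiny hand-written fragment, not a complete or valid PDF file.
sample = (
    b"1 0 obj << /Type /Catalog /OpenAction 2 0 R >> endobj "
    b"2 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj"
)
features = keyword_counts(sample)
print(features)
```

This sketch scans only the uncompressed bytes: it does not decode compressed streams or names written with `#xx` hex escapes, so a file that hides its keywords that way would yield zero counts. That gap is a simple instance of the evasion problem the abstract describes.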
