CR · AI · Feb 26, 2025

Poster: Long PHP webshell files detection based on sliding window attention

arXiv:2502.19257v2 · 2 citations · h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses webshell detection for web application security. The contribution is incremental: it builds on existing deep learning methods and adds a sliding window attention mechanism to handle long files.

The paper targets the detection of long PHP webshell files. The proposed method converts PHP source code to opcodes, extracts features with CodeBert and FastText, and applies a sliding window attention mechanism to capture malicious behavior spread across long files. The authors report high detection accuracy.
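The opcode-to-tuple stage of this pipeline can be sketched as follows. This is a minimal illustration, assuming that an Opcode Double-Tuple (ODT) is a pair of adjacent opcodes in the opcode sequence; the text here does not define ODTs precisely. The function name `opcode_double_tuples` and the sample trace are hypothetical, and the opcode dump itself is assumed to come from an external tool.

```python
from typing import List, Tuple


def opcode_double_tuples(opcodes: List[str]) -> List[Tuple[str, str]]:
    """Pair each opcode with its successor.

    Assumption: an ODT is a bigram of consecutive opcodes. The paper may
    define it differently.
    """
    return list(zip(opcodes, opcodes[1:]))


# Hypothetical opcode trace, for illustration only (not taken from the paper).
trace = ["FETCH_R", "INIT_FCALL", "SEND_VAR", "DO_FCALL", "INCLUDE_OR_EVAL"]
odts = opcode_double_tuples(trace)

for first, second in odts:
    print(first, second)
```

Under this assumption, a sequence of n opcodes yields n - 1 tuples, and each tuple keeps a small amount of ordering context that a bag of single opcodes would lose.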

A webshell is a type of backdoor, and web applications are widely exposed to webshell injection attacks, so it is important to study webshell detection techniques. In this study, we propose a webshell detection method. We first convert PHP source code to opcodes and then extract Opcode Double-Tuples (ODTs). Next, we combine the CodeBert and FastText models for feature representation and classification. Because deep learning methods have difficulty detecting long webshell files, we introduce a sliding window attention mechanism that captures malicious behavior within long files. Experimental results show that our method reaches high accuracy in webshell detection and addresses a weakness of traditional methods, which struggle with new webshell variants and anti-detection techniques.
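The abstract does not spell out how the sliding window attention is computed. As a rough sketch of the general idea, the code below implements one common form of local attention, in which each position attends only to neighbors within a fixed window. It is an assumption, not the authors' implementation: the input vectors serve as queries, keys, and values with no learned projections, and the name `sliding_window_attention` and the `window` parameter are hypothetical.

```python
import math
from typing import List


def sliding_window_attention(x: List[List[float]], window: int) -> List[List[float]]:
    """Local self-attention: position i attends to positions i-window .. i+window.

    Simplification: x is used directly as queries, keys, and values.
    """
    n = len(x)
    d = len(x[0]) if n else 0
    out = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        # Scaled dot-product scores against neighbors inside the window only.
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            for j in range(lo, hi)
        ]
        # Numerically stable softmax over the window.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted average of the neighbor vectors.
        out.append([
            sum(w * x[j][k] for w, j in zip(weights, range(lo, hi))) / total
            for k in range(d)
        ])
    return out


# Toy embeddings for illustration only.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
mixed = sliding_window_attention(tokens, window=1)
```

Because each position scores at most 2 * window + 1 neighbors, the cost grows linearly with sequence length rather than quadratically, which is the usual motivation for applying this kind of attention to long inputs.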

Code Implementations: 1 repo