CR AI Aug 25, 2025

Learning from Few Samples: A Novel Approach for High-Quality Malcode Generation

Haijian Ma, Daizong Liu, Xiaowen Cai, Pan Zhou, Yulai Xie

arXiv:2508.18148v1, 1 citation, h-index: 27, EMNLP

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of training adaptive defense systems against evolving cyber threats, but the contribution appears incremental: it combines existing methods (GANs and LLMs) and applies them to a specific domain.

The paper tackles the shortage of labeled malicious samples for training intrusion detection systems. It introduces a semi-supervised framework that integrates GANs and LLMs to improve both malicious code generation and SQL injection detection in few-sample scenarios. The authors report that both capabilities improve under this joint training.

Intrusion Detection Systems (IDS) play a crucial role in network security defense. However, a significant challenge in training IDS detection models is the shortage of adequately labeled malicious samples. To address this issue, this paper introduces a novel semi-supervised framework, GANGRL-LLM, which integrates Generative Adversarial Networks (GANs) with Large Language Models (LLMs) to enhance malicious code generation and SQL Injection (SQLi) detection capabilities in few-sample learning scenarios. Specifically, our framework adopts a collaborative training paradigm in which: (1) the GAN-based discriminator improves malicious pattern recognition through adversarial learning with generated samples and limited real samples; and (2) the LLM-based generator refines the quality of malicious code synthesis using reward signals from the discriminator. The experimental results demonstrate that, even with a limited number of labeled samples, our training framework is highly effective in enhancing both malicious code generation and detection capabilities. This dual enhancement capability offers a promising solution for developing adaptive defense systems capable of countering evolving cyber threats.
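
The collaborative training paradigm in the abstract (a discriminator trained on real and generated samples, and a generator updated from the discriminator's reward) can be illustrated with a toy sketch. This is not the authors' GANGRL-LLM implementation: the abstract gives no architecture or loss details, so everything below is an assumption made for illustration. The "generator" is a softmax policy over four abstract integer tokens rather than an LLM, the "discriminator" is a per-token logistic score, and the names `train`, `real_token`, `gen_logits`, and `disc_w` are hypothetical. The generator update is a standard REINFORCE step with an expected-reward baseline.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=2000, lr=0.1, n_tokens=4, real_token=2, seed=0):
    """Toy discriminator-as-reward loop (illustrative assumption only).

    real_token stands in for the single "real" labeled sample; the
    tokens are abstract integers with no meaning outside this sketch.
    """
    rng = random.Random(seed)
    gen_logits = [0.0] * n_tokens  # generator: softmax policy over tokens
    disc_w = [0.0] * n_tokens      # discriminator: one logistic weight per token

    for _ in range(steps):
        probs = softmax(gen_logits)
        fake = rng.choices(range(n_tokens), weights=probs)[0]

        # Discriminator step: logistic-regression ascent, labeling the
        # real sample 1 and the generated sample 0.
        for token, label in ((real_token, 1.0), (fake, 0.0)):
            disc_w[token] += lr * (label - sigmoid(disc_w[token]))

        # Generator step: REINFORCE, with the discriminator's score of
        # the generated sample as reward and the expected reward under
        # the current policy as baseline.
        rewards = [sigmoid(w) for w in disc_w]
        baseline = sum(p * r for p, r in zip(probs, rewards))
        advantage = rewards[fake] - baseline
        for k in range(n_tokens):
            grad_log_prob = (1.0 if k == fake else 0.0) - probs[k]
            gen_logits[k] += lr * advantage * grad_log_prob

    return softmax(gen_logits), disc_w
```

Calling `train()` returns the generator's final token distribution and the discriminator's weights. In this toy setting the generator's probability mass should drift toward `real_token`, because that is the only token the discriminator is ever rewarded for scoring highly; how the real framework balances the two updates is not stated in the abstract.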
