CR AI · Aug 12, 2024

Audit-LLM: Multi-Agent Collaboration for Log-based Insider Threat Detection

Chengyu Song, Linru Ma, Jianming Zheng, Jinzhi Liao, Hongyu Kuang, Lin Yang

arXiv:2408.08902v117.227 citations · h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses insider threat detection for cybersecurity applications, presenting an incremental improvement through a novel multi-agent approach.

The paper tackles log-based insider threat detection, where LLMs struggle with diverse activity types, overlong log files, and faithfulness hallucination. It introduces Audit-LLM, a multi-agent framework with an evidence-based debate mechanism, and reports better performance than existing baselines on three datasets.

Log-based insider threat detection (ITD) detects malicious user activities by auditing log entries. Recently, large language models (LLMs) with strong common-sense knowledge have emerged in the domain of ITD. Nevertheless, diverse activity types and overlong log files make it difficult for LLMs to directly discern malicious activities among myriad normal ones. Furthermore, the faithfulness hallucination issue of LLMs makes them harder to apply in ITD, as the generated conclusion may not align with user commands and activity context. In response to these challenges, we introduce Audit-LLM, a multi-agent log-based insider threat detection framework comprising three collaborative agents: (i) the Decomposer agent, which breaks down the complex ITD task into manageable sub-tasks using Chain-of-Thought (CoT) reasoning; (ii) the Tool Builder agent, which creates reusable tools for sub-tasks to overcome context length limitations in LLMs; and (iii) the Executor agent, which generates the final detection conclusion by invoking the constructed tools. To enhance conclusion accuracy, we propose a pair-wise Evidence-based Multi-agent Debate (EMAD) mechanism, in which two independent Executors iteratively refine their conclusions through reasoning exchange to reach a consensus. Comprehensive experiments conducted on three publicly available ITD datasets (CERT r4.2, CERT r5.2, and PicoDomain) demonstrate the superiority of our method over existing baselines and show that the proposed EMAD significantly improves the faithfulness of explanations generated by LLMs.
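The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: no LLM is called, and every name in it (`decompose`, `build_tools`, `Executor`, `debate`, the two rule-based tools, the log fields `time` and `activity`, and the `min_evidence` threshold) is a hypothetical stand-in chosen for illustration. It only shows the shape of the pipeline: sub-tasks become reusable tools that scan the logs outside the prompt, and two Executors exchange evidence until their verdicts agree.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

LogEntry = Dict[str, str]
Tool = Callable[[List[LogEntry]], List[str]]  # a tool returns evidence strings


def decompose(task: str) -> List[str]:
    # Decomposer stand-in: a real agent would derive sub-tasks with CoT prompting.
    return ["after_hours_logon", "removable_device"]


def build_tools(sub_tasks: List[str]) -> Dict[str, Tool]:
    # Tool Builder stand-in: one reusable function per sub-task, so the raw
    # logs are scanned by code instead of being placed in a prompt.
    def after_hours_logon(logs: List[LogEntry]) -> List[str]:
        return [f"{e['time']} logon outside 08:00-18:00" for e in logs
                if e["activity"] == "logon" and not 8 <= int(e["time"][:2]) < 18]

    def removable_device(logs: List[LogEntry]) -> List[str]:
        return [f"{e['time']} removable device connected" for e in logs
                if e["activity"] == "device_connect"]

    registry = {"after_hours_logon": after_hours_logon,
                "removable_device": removable_device}
    return {name: registry[name] for name in sub_tasks}


@dataclass
class Conclusion:
    malicious: bool
    evidence: List[str]


class Executor:
    # Executor stand-in: invokes its tools and flags the user as malicious
    # once it holds at least `min_evidence` pieces of evidence (an assumption).
    def __init__(self, tools: Dict[str, Tool], tool_names: List[str],
                 min_evidence: int) -> None:
        self.tools = [tools[name] for name in tool_names]
        self.min_evidence = min_evidence

    def conclude(self, logs: List[LogEntry],
                 peer: Optional[Conclusion] = None) -> Conclusion:
        evidence = [item for tool in self.tools for item in tool(logs)]
        if peer is not None:
            # Debate step: take over peer evidence this Executor did not have.
            evidence += [item for item in peer.evidence if item not in evidence]
        return Conclusion(len(evidence) >= self.min_evidence, evidence)


def debate(a: Executor, b: Executor, logs: List[LogEntry], max_rounds: int = 3):
    # Pair-wise debate: exchange conclusions until the verdicts agree
    # or the round limit is reached.
    ca, cb = a.conclude(logs), b.conclude(logs)
    for _ in range(max_rounds):
        if ca.malicious == cb.malicious:
            break
        ca, cb = a.conclude(logs, cb), b.conclude(logs, ca)
    return ca, cb
```

In this toy setup, giving the two Executors different tool subsets makes them disagree at first; after one exchange the Executor with less evidence adopts the missing item and both verdicts align. If the round limit is reached without agreement, `debate` simply returns the two differing conclusions, and how to resolve that case is left open here.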
