Prompt-Based Learning for Thread Structure Prediction in Cybersecurity Forums
This work addresses the need for better analysis of hacker forums to combat cyber threats, representing an incremental advance by applying instructional prompting to a new domain.
The paper tackles the problem of predicting thread structures in cybersecurity forums to identify skilled users and improve threat prediction, proposing a prompt-based learning method that significantly outperforms existing methods on Reddit and Hacker Forums datasets.
With recent trends indicating that cyber crimes are increasing in both frequency and cost, it is imperative to develop new methods that leverage data-rich hacker forums to help combat ever-evolving cyber threats. Defining interactions within these forums is critical because it facilitates identifying highly skilled users, which can improve the prediction of novel threats and future cyber attacks. We propose a method called Next Paragraph Prediction with Instructional Prompting (NPP-IP) to predict thread structures while grounding predictions in the context around posts. This is the first application of an instructional prompting approach to the cybersecurity domain. We evaluate NPP-IP on the Reddit dataset and the Hacker Forums dataset, which contains posts and thread structures from real hacker forum threads, and compare its performance with existing methods. The experimental evaluation shows that our proposed method predicts thread structure significantly better than existing methods, allowing for better social network prediction based on forum interactions.