Evaluating the Usability of LLMs in Threat Intelligence Enrichment
This work addresses usability concerns for security professionals adopting LLMs in threat intelligence, but it is incremental, as it focuses on evaluation rather than novel methods.
This study evaluates the usability of LLMs in threat intelligence enrichment by assessing five models on factors such as user interface design and error handling. It identifies key usability issues and offers actionable recommendations to improve user-friendliness and reliability.
Large Language Models (LLMs) have the potential to significantly enhance threat intelligence by automating the collection, preprocessing, and analysis of threat data. However, the usability of these tools is critical to their effective adoption by security professionals. Despite the advanced capabilities of LLMs, concerns persist about their reliability, their accuracy, and their potential to generate inaccurate information. This study conducts a comprehensive usability evaluation of five LLMs (ChatGPT, Gemini, Cohere, Copilot, and Meta AI), focusing on their user interface design, error handling, learning curve, performance, and integration with existing tools in threat intelligence enrichment. Using a heuristic walkthrough and a user study, we identify key usability issues and offer actionable recommendations for improvement. Our findings aim to bridge the gap between LLM functionality and user experience, promoting more efficient and accurate threat intelligence practices by making these tools more user-friendly and reliable.