Randomized Controlled Trials for Phishing Triage Agent
This paper addresses high-volume phishing email triage in security operations centers and offers actionable insights for SOC leaders. Its contribution is incremental, however, because it applies an existing RCT methodology to a new domain-specific AI agent.
The paper evaluates an AI agent for phishing email triage in security operations centers through a randomized controlled trial. Agent-augmented analysts achieved up to 6.5 times as many true positives per analyst minute and a 77% improvement in verdict accuracy compared with a control group.
Security operations centers (SOCs) face a persistent challenge: efficiently triaging a high volume of user-reported phishing emails while maintaining robust protection against threats. This paper presents the first randomized controlled trial (RCT) evaluating the impact of a domain-specific AI agent, the Microsoft Security Copilot Phishing Triage Agent, on analyst productivity and accuracy. Our results demonstrate that agent-augmented analysts achieved up to 6.5 times as many true positives per analyst minute and a 77% improvement in verdict accuracy compared to a control group. The agent's queue prioritization and verdict explanations were both significant drivers of efficiency. Behavioral analysis revealed that agent-augmented analysts reallocated their attention, spending 53% more time on malicious emails, and were not prone to rubber-stamping the agent's malicious verdicts. These findings offer actionable insights for SOC leaders considering AI adoption, including the potential for agents to fundamentally change the optimal allocation of SOC resources.
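To make the two headline metrics concrete, the following minimal Python sketch shows one plausible way to compute "true positives per analyst minute" and a relative improvement in verdict accuracy between a control arm and an agent-augmented arm. The `ArmSummary` type, the helper functions, and every number are hypothetical illustrations and are not taken from the study. Whether the reported 77% figure is a relative or an absolute change is not stated here, so the relative form is an assumption, and the paper's exact definitions may differ.

```python
from dataclasses import dataclass


@dataclass
class ArmSummary:
    """Aggregate counts for one trial arm (hypothetical structure)."""
    true_positives: int     # malicious emails correctly marked malicious
    correct_verdicts: int   # verdicts matching ground truth
    total_verdicts: int     # all verdicts issued
    analyst_minutes: float  # total analyst time spent


def tp_per_minute(arm: ArmSummary) -> float:
    """True positives found per minute of analyst time."""
    return arm.true_positives / arm.analyst_minutes


def accuracy(arm: ArmSummary) -> float:
    """Fraction of verdicts that match ground truth."""
    return arm.correct_verdicts / arm.total_verdicts


def relative_improvement(treated: float, control: float) -> float:
    """Relative change of the treated value over the control value."""
    return (treated - control) / control


# Made-up numbers, chosen only to make the arithmetic easy to follow.
control = ArmSummary(true_positives=4, correct_verdicts=20,
                     total_verdicts=40, analyst_minutes=60.0)
treated = ArmSummary(true_positives=12, correct_verdicts=30,
                     total_verdicts=40, analyst_minutes=60.0)

throughput_ratio = tp_per_minute(treated) / tp_per_minute(control)
accuracy_gain = relative_improvement(accuracy(treated), accuracy(control))

print(f"throughput ratio: {throughput_ratio:.2f}x")
print(f"relative accuracy improvement: {accuracy_gain:.0%}")
```

In this toy setup the treated arm finds three times as many true positives per minute and improves accuracy from 50% to 75%, a 50% relative gain. A real analysis would also need per-analyst variation and confidence intervals, which this sketch omits.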