Evaluating LLM-based Personal Information Extraction and Countermeasures
This work addresses security risks such as spear phishing by evaluating and mitigating LLM-based personal information extraction, though its contribution is incremental because it builds on existing LLM and countermeasure concepts.
The study benchmarked large language models (LLMs) for extracting personal information from profiles, finding that LLMs outperform traditional methods, and introduced a prompt injection countermeasure that reduces attack effectiveness to the level of traditional methods.
Automatically extracting personal information -- such as name, phone number, and email address -- from publicly available profiles at a large scale is a stepping stone to many other security attacks, including spear phishing. Traditional methods -- such as regular expressions, keyword search, and entity detection -- achieve limited success at such personal information extraction. In this work, we perform a systematic measurement study to benchmark large language model (LLM) based personal information extraction and countermeasures. Towards this goal, we present a framework for LLM-based extraction attacks; collect four datasets including a synthetic dataset generated by GPT-4 and three real-world datasets with manually labeled eight categories of personal information; introduce a novel mitigation strategy based on prompt injection; and systematically benchmark LLM-based attacks and countermeasures using ten LLMs and five datasets. Our key findings include: LLMs can be misused by attackers to accurately extract various personal information from personal profiles; LLMs outperform traditional methods; and prompt injection can defend against strong LLM-based attacks, reducing the attack to less effective traditional ones.
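To make the "traditional methods" baseline concrete, the following is a minimal sketch of regular-expression extraction for two of the categories named above (email address and phone number). It is not the paper's implementation: the patterns, the function name `extract_with_regex`, and the sample profile text are all assumptions chosen for illustration. The last call hints at why such methods achieve limited success, since a lightly obfuscated address matches no pattern.

```python
import re

# Hypothetical patterns (assumptions, not taken from the paper).
# They cover common US-style phone numbers and plain email addresses only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def extract_with_regex(profile_text):
    """Return every email and phone match found in the profile text."""
    return {
        "email": EMAIL_RE.findall(profile_text),
        "phone": PHONE_RE.findall(profile_text),
    }


# Made-up profile text for illustration.
profile = (
    "Jane Doe, Associate Professor. "
    "Contact: jane.doe@example.edu or (555) 123-4567."
)
result = extract_with_regex(profile)
print(result)

# An obfuscated address defeats the pattern entirely.
obfuscated = extract_with_regex("Reach me at jane dot doe at example dot edu")
print(obfuscated)
```

A pattern-based extractor like this only finds strings that match a fixed surface form, which is one plausible reason the abstract reports limited success for traditional methods compared with LLM-based extraction.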