AI, CR | Aug 15, 2022

Targeted Honeyword Generation with Language Models

arXiv:2208.06946v2 | 6.25 citations | h-index: 17

Originality: Incremental advance

AI Analysis

This work addresses security vulnerabilities in authentication systems, for both users and organizations, by making honeyword generation more robust against PII-based attacks. The advance is incremental, since it builds on existing honeyword techniques.

The paper tackles the problem of generating honeywords (fake passwords) that are indistinguishable from real passwords, especially when attackers exploit personally identifiable information (PII). It uses pre-trained language models that are not trained on real passwords. In a pilot experiment, participants found it extremely difficult to distinguish real passwords from honeywords.

Honeywords are fictitious passwords inserted into databases in order to identify password breaches. The major difficulty is how to produce honeywords that are difficult to distinguish from real passwords. Although the generation of honeywords has been widely investigated in the past, the majority of existing research assumes attackers have no knowledge of the users. These honeyword generating techniques (HGTs) may utterly fail if attackers exploit users' personally identifiable information (PII) and the real passwords include users' PII. In this paper, we propose to build a more secure and trustworthy authentication system that employs off-the-shelf pre-trained language models, which require no further training on real passwords, to produce honeywords while retaining the PII of the associated real password, therefore significantly raising the bar for attackers. We conducted a pilot experiment in which individuals were asked to distinguish between authentic passwords and honeywords, given the username, for two HGTs: GPT-3 and a tweaking technique. Results show that it is extremely difficult to distinguish the real passwords from the artificial ones for both techniques. We speculate that a larger sample size could reveal a significant difference between the two HGTs, favouring our proposed approach.
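
To make the idea of a PII-retaining HGT concrete, here is a minimal sketch of a tweaking-style generator: each character outside the PII substrings is replaced by a random character of the same class, so every honeyword keeps the PII of the real password. This is an illustration only, not the paper's GPT-3 method or its exact tweaking baseline. The names `tweak_honeywords`, `pii_tokens`, `k`, and `max_attempts` are hypothetical, and the code assumes ASCII passwords.

```python
import random
import string


def _protected_positions(password, pii_tokens):
    """Indices of characters that belong to a PII token (case-insensitive)."""
    lowered = password.lower()  # assumes ASCII, so lengths stay aligned
    protected = set()
    for token in pii_tokens:
        token = token.lower()
        if not token:
            continue
        start = lowered.find(token)
        while start != -1:
            protected.update(range(start, start + len(token)))
            start = lowered.find(token, start + 1)
    return protected


def _same_class_char(ch, rng):
    """Random character of the same class as ch (digit, lower, upper, symbol)."""
    if ch.isdigit():
        return rng.choice(string.digits)
    if ch.islower():
        return rng.choice(string.ascii_lowercase)
    if ch.isupper():
        return rng.choice(string.ascii_uppercase)
    return rng.choice("!@#$%&*?")  # hypothetical symbol set


def tweak_honeywords(password, pii_tokens, k=5, rng=None, max_attempts=1000):
    """Return (sweetwords, index_of_real): k candidates, exactly one of them real."""
    rng = rng or random.Random()  # use random.SystemRandom() outside of demos
    protected = _protected_positions(password, pii_tokens)
    honeywords = set()
    for _ in range(max_attempts):
        if len(honeywords) == k - 1:
            break
        candidate = "".join(
            ch if i in protected else _same_class_char(ch, rng)
            for i, ch in enumerate(password)
        )
        if candidate != password:
            honeywords.add(candidate)
    if len(honeywords) < k - 1:
        raise ValueError("not enough tweakable characters outside the PII")
    # Sort before shuffling so a seeded rng gives a reproducible order.
    sweetwords = sorted(honeywords) + [password]
    rng.shuffle(sweetwords)
    return sweetwords, sweetwords.index(password)
```

For example, `tweak_honeywords("alice1987!", ["alice"], k=5)` returns five candidates that all begin with `alice`, together with the position of the real password. In a honeyword system that position would be stored separately from the password database, so that a login attempt with any of the other candidates signals a breach.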
