Position: It's Time to Act on the Risk of Efficient Personalized Text Generation
The risk of impersonation through personalized text generation is significant for individuals and organizations vulnerable to such attacks, particularly because the technology is accessible to private individuals and can be used for malicious purposes.
The authors address the risks of efficient personalized text generation: models that credibly imitate an individual's writing style can be used to produce phishing emails or fraudulent social media accounts. Such models can be created and run cheaply on consumer-grade hardware.
The recent surge in high-quality open-source Generative AI text models (colloquially: LLMs), together with efficient finetuning techniques, has opened the possibility of refining an open-source model with a specific person's own data to create a high-quality personalized model that generates text attuned to that individual's needs and credibly imitates their writing style. The technology to create such models is accessible to private individuals, and training and running them can be done cheaply on consumer-grade hardware. While these advancements are a huge gain for usability and privacy, this position paper argues that the practical feasibility of impersonating specific individuals also introduces novel safety risks. For instance, this technology enables the creation of phishing emails or fraudulent social media accounts based on small amounts of publicly available text, and it can be used by individuals themselves to evade AI text detection. We further argue that these risks are complementary to, and distinct from, the much-discussed risks of other impersonation attacks such as image, voice, or video deepfakes, and that they are not adequately addressed by the larger research community or by the current generation of open- and closed-source models.