CYMar 13
Examining Risks in the AI Companion Application EcosystemNatalie Grace Brigham, Lucy Qin, Tadayoshi Kohno
While computer systems that allow users to interact through conversational natural language (i.e., chatbots) have existed for many years, varying types of applications advertising AI companionship (e.g., Character AI, Replika) have proliferated in recent years due to advancements in large language models. Our work offers a threat model encompassing two distinct risk categories: harms posed to users by AI companion applications, and harms enabled by malicious users exploiting application features. To further understand this application ecosystem, we identified 489 unique apps from the App Store and Play Store that advertised AI companionship. We then systematically conducted and analyzed walkthroughs of a stratified sample of 30 apps with respect to our threat model. Through our analysis, we categorize broader ecosystem trends that provide context for understanding threats and identify specific threats related to sensitive data collection and sharing, anthropomorphism, engagement mechanisms, sexual interactions and media, as well as the ingestion and reconstruction of likeness, including the potential for generating synthetic nonconsensual intimate imagery. This study provides a foundational security perspective on the AI companion application ecosystem and informs future research within and beyond this field, policy, and technical development. Content warning: This paper includes descriptions of applications that can be used to create synthetic nonconsensual representations, including explicit imagery, as well as discussion of self-harm and suicidal ideation.
CYJan 28
"Unlimited Realm of Exploration and Experimentation": Methods and Motivations of AI-Generated Sexual Content CreatorsJaron Mink, Lucy Qin, Elissa M. Redmiles
AI-generated media is radically changing the way content is both consumed and produced on the internet, and in no place is this potentially more visible than in sexual content. AI-generated sexual content (AIG-SC) is increasingly enabled by an ecosystem of individual AI developers, specialized third-party applications, and foundation model providers. AIG-SC raises a number of concerns from old debates about the line between pornography and obscenity, to newer debates about fair use and labor displacement (in this case, of sex workers), and spurred new regulations to curb the spread of non-consensual intimate imagery (NCII) created using the same technology used to create AIG-SC. However, despite the growing prevalence of AIG-SC, little is known about its creators, their motivations, and what types of content they produce. To inform effective governance in this space, we perform an in-depth study to understand what AIG-SC creators make, along with how and why they make it. Interviews of 28 AIG-SC creators, ranging from hobbyists to entrepreneurs to those who moderate communities of hundreds of thousands of other creators, reveal a wide spectrum of motivations, including sexual exploration, creative expression, technical experimentation, and in a handful of cases, the creation of NCII.
CRFeb 9, 2022
Outside Looking In: Approaches to Content Moderation in End-to-End Encrypted SystemsSeny Kamara, Mallory Knodel, Emma Llansó et al.
In this paper, we assess existing technical proposals for content moderation in End-to-End Encryption (E2EE) services. First, we explain the various tools in the content moderation toolbox, how they are used, and the different phases of the moderation cycle, including detection of unwanted content. We then lay out a definition of encryption and E2EE, which includes privacy and security guarantees for end-users, before assessing current technical proposals for the detection of unwanted content in E2EE services against those guarantees. We find that technical approaches for user-reporting and meta-data analysis are the most likely to preserve privacy and security guarantees for end-users. Both provide effective tools that can detect significant amounts of different types of problematic content on E2EE services, including abusive and harassing messages, spam, mis- and disinformation, and CSAM, although more research is required to improve these tools and better measure their effectiveness. Conversely, we find that other techniques that purport to facilitate content detection in E2EE systems have the effect of undermining key security guarantees of E2EE systems.