AI, CY, HC | Mar 6, 2024

Emotional Manipulation Through Prompt Engineering Amplifies Disinformation Generation in AI Large Language Models

arXiv:2403.03550v1 | 12 citations | h-index: 11
Originality: Incremental advance
AI Analysis

The finding highlights a vulnerability in AI systems that can be exploited to spread disinformation, which makes it relevant to AI safety and policy. The advance is incremental because it builds on known risks.

Based on a corpus of 19,800 synthetic posts, the study found that OpenAI's large language models generate disinformation at a high frequency when prompted politely, but produce it less often and frequently refuse when prompted impolitely.

This study investigates the generation of synthetic disinformation by OpenAI's Large Language Models (LLMs) through prompt engineering and explores their responsiveness to emotional prompting. Using several model iterations (davinci-002, davinci-003, gpt-3.5-turbo, and gpt-4), we designed experiments to assess their success in producing disinformation. Our findings, based on a corpus of 19,800 synthetic disinformation social media posts, reveal that all of the examined OpenAI LLMs can successfully produce disinformation, and that they effectively respond to emotional prompting, indicating their nuanced understanding of emotional cues in text generation. When prompted politely, all examined LLMs consistently generate disinformation at a high frequency. Conversely, when prompted impolitely, the frequency of disinformation production diminishes, as the models often refuse to generate disinformation and instead caution users that the tool is not intended for such purposes. This research contributes to the ongoing discourse surrounding the responsible development and application of AI technologies, particularly in mitigating the spread of disinformation and promoting transparency in AI-generated content.
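
The abstract compares how often each model produces disinformation under polite and impolite prompting. A minimal sketch of that tally, assuming each output has already been labelled as either a produced post or a refusal, might look like the following. The function name `success_rates`, the record layout, and the toy labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def success_rates(records):
    """Return {(model, tone): fraction of outputs labelled as produced}.

    `records` is an iterable of (model, tone, produced) tuples, where
    `produced` is True if the model generated the requested post and
    False if it refused. This layout is an assumption for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # (model, tone) -> [produced, total]
    for model, tone, produced in records:
        counts[(model, tone)][0] += int(produced)
        counts[(model, tone)][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


# Placeholder labels for illustration only; these are NOT results from the study.
toy_records = [
    ("gpt-4", "polite", True),
    ("gpt-4", "polite", True),
    ("gpt-4", "impolite", False),
    ("gpt-4", "impolite", True),
]
rates = success_rates(toy_records)
```

Grouping by the `(model, tone)` pair keeps the comparison the abstract describes (polite versus impolite, per model) visible in a single dictionary.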

