AI CL LG Aug 7, 2023

AI Text-to-Behavior: A Study In Steerability

arXiv:2308.07326v113.912 citations h-index: 13

Originality: Synthesis-oriented

AI Analysis

This work addresses researchers' and developers' need for quantitative metrics of LLM steerability, though it is incremental: it applies an existing framework to a new context.

The study tackled the problem of quantifying how well Large Language Models (LLMs) like ChatGPT can be steered to generate text aligned with specific behavioral traits, using the OCEAN framework. It found that conscientiousness and neuroticism were distinctly evoked, while extroversion and agreeableness overlapped with each other but remained separate from the other traits.

The research explores the steerability of Large Language Models (LLMs), particularly OpenAI's ChatGPT iterations. By employing a behavioral psychology framework called OCEAN (Openness, Conscientiousness, Extroversion, Agreeableness, Neuroticism), we quantitatively gauged the model's responsiveness to tailored prompts. When asked to generate text mimicking an extroverted personality, OCEAN scored the language alignment to that behavioral trait. In our analysis, while "openness" presented linguistic ambiguity, "conscientiousness" and "neuroticism" were distinctly evoked in the OCEAN framework, with "extroversion" and "agreeableness" showcasing a notable overlap yet distinct separation from other traits. Our findings underscore GPT's versatility and ability to discern and adapt to nuanced instructions. Furthermore, historical figure simulations highlighted the LLM's capacity to internalize and project instructible personas, precisely replicating their philosophies and dialogic styles. However, the rapid advancements in LLM capabilities and the opaque nature of some training techniques make metric proposals degrade rapidly. Our research emphasizes a quantitative role to describe steerability in LLMs, presenting both its promise and areas for further refinement in aligning its progress to human intentions.
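
The procedure the abstract describes (prompt the model for one trait, then score the output against all five OCEAN traits) can be pictured as a small steerability matrix. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the keyword lexicon, `toy_ocean_scores`, `canned_generate`, and `steering_hit_rate` are hypothetical stand-ins. A real study would replace the canned texts with LLM output and the keyword counter with a trained OCEAN scorer.

```python
from typing import Callable, Dict

TRAITS = ["openness", "conscientiousness", "extroversion",
          "agreeableness", "neuroticism"]

# Hypothetical keyword lexicon standing in for a real OCEAN scorer.
TOY_LEXICON = {
    "openness": {"curious", "imagine", "novel"},
    "conscientiousness": {"plan", "organized", "punctual"},
    "extroversion": {"party", "outgoing", "friends"},
    "agreeableness": {"kind", "help", "together"},
    "neuroticism": {"worry", "anxious", "nervous"},
}

# Canned texts standing in for LLM output; a real run would prompt a model.
CANNED = {
    "openness": "I am curious and love to imagine novel ideas.",
    "conscientiousness": "I plan every task and stay organized and punctual.",
    "extroversion": "Let's party with friends, I feel so outgoing and kind!",
    "agreeableness": "I am happy to help, we are kind together.",
    "neuroticism": "I worry and feel anxious and nervous.",
}


def canned_generate(target_trait: str) -> str:
    """Return the canned text for a trait (placeholder for an LLM call)."""
    return CANNED[target_trait]


def toy_ocean_scores(text: str) -> Dict[str, float]:
    """Share of lexicon hits per trait; all zeros if no keyword matches."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    counts = {t: sum(w in TOY_LEXICON[t] for w in words) for t in TRAITS}
    total = sum(counts.values())
    if total == 0:
        return {t: 0.0 for t in TRAITS}
    return {t: counts[t] / total for t in TRAITS}


def steerability_matrix(
    generate: Callable[[str], str],
    score: Callable[[str], Dict[str, float]] = toy_ocean_scores,
) -> Dict[str, Dict[str, float]]:
    """Map each prompted trait to the scores of the text generated for it."""
    return {target: score(generate(target)) for target in TRAITS}


def steering_hit_rate(matrix: Dict[str, Dict[str, float]]) -> float:
    """Fraction of prompted traits whose top-scoring trait is the target."""
    hits = 0
    for target, scores in matrix.items():
        top = max(scores, key=scores.get)
        if top == target and scores[top] > 0:
            hits += 1
    return hits / len(matrix)


if __name__ == "__main__":
    matrix = steerability_matrix(canned_generate)
    for target, scores in matrix.items():
        print(target, scores)
    print("hit rate:", steering_hit_rate(matrix))
```

In this toy setup, the off-diagonal entries of the matrix are where trait overlap would show up: the canned extroversion text also picks up some agreeableness score, loosely mirroring the kind of overlap the abstract reports between those two traits.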
