HC · May 29
AI Behavioral Science
Matthew O. Jackson, Qiaozhu Mei, Stephanie W. Wang et al.
We outline a foundation for a new field of "AI Behavioral Science," covering three perspectives. First, as AI becomes ubiquitous and is increasingly proprietary and opaque, it becomes vital to develop techniques for assessing AI behavior. We outline how tools developed to assess people's behaviors by social scientists can be used to assess and infer AI's behaviors, biases, tendencies, and heuristics. Second, we also discuss how AI can change the ways in which we learn about human behavior. Beyond its computational power, AI offers new techniques for simulating, inferring, and predicting human behaviors that we outline and discuss. Third, as humans and AI are interacting in increasingly complex and intertwined systems, we need to understand the implications for the resulting economic and political outcomes. We outline issues that are increasingly pressing concerning the future of human-AI interactions and potential changes and disruptions that can ensue.
AI · Nov 19, 2023
A Turing Test: Are AI Chatbots Behaviorally Similar to Humans?
Qiaozhu Mei, Yutong Xie, Walter Yuan et al.
We administer a Turing Test to AI Chatbots. We examine how Chatbots behave in a suite of classic behavioral games that are designed to elicit characteristics such as trust, fairness, risk-aversion, cooperation, etc., as well as how they respond to a traditional Big-5 psychological survey that measures personality traits. ChatGPT-4 exhibits behavioral and personality traits that are statistically indistinguishable from a random human from tens of thousands of human subjects from more than 50 countries. Chatbots also modify their behavior based on previous experience and contexts "as if" they were learning from the interactions, and change their behavior in response to different framings of the same strategic situation. Their behaviors are often distinct from average and modal human behaviors, in which case they tend to behave on the more altruistic and cooperative end of the distribution. We estimate that they act as if they are maximizing an average of their own and partner's payoffs.
AI · May 29, 2025 · Code
Be.FM: Open Foundation Models for Human Behavior
Yutong Xie, Zhuoheng Li, Xiyuan Wang et al.
Despite their success in numerous fields, the potential of foundation models for modeling and understanding human behavior remains largely unexplored. We introduce Be.FM, one of the first open foundation models designed for human behavior modeling. Built upon open-source large language models and fine-tuned on a diverse range of behavioral data, Be.FM can be used to understand and predict human decision-making. We construct a comprehensive set of benchmark tasks for testing the capabilities of behavioral foundation models. Our results demonstrate that Be.FM can predict behaviors, infer characteristics of individuals and populations, generate insights about contexts, and apply behavioral science knowledge.
AI · Dec 16, 2024
How Different AI Chatbots Behave? Benchmarking Large Language Models in Behavioral Economics Games
Yutong Xie, Yiyao Liu, Zhuang Ma et al.
The deployment of large language models (LLMs) in diverse applications requires a thorough understanding of their decision-making strategies and behavioral patterns. As a supplement to a recent study on the behavioral Turing test, this paper presents a comprehensive analysis of five leading LLM-based chatbot families as they navigate a series of behavioral economics games. By benchmarking these AI chatbots, we aim to uncover and document both common and distinct behavioral patterns across a range of scenarios. The findings provide valuable insights into the strategic preferences of each LLM, highlighting potential implications for their deployment in critical decision-making roles.
AI · Mar 20, 2025
Using Large Language Models to Categorize Strategic Situations and Decipher Motivations Behind Human Behaviors
Yutong Xie, Qiaozhu Mei, Walter Yuan et al.
By varying prompts to a large language model, we can elicit the full range of human behaviors in a variety of different scenarios in classic economic games. By analyzing which prompts elicit which behaviors, we can categorize and compare different strategic situations, which can also help provide insight into what different economic scenarios induce people to think about. We discuss how this provides a first step towards a non-standard method of inferring (deciphering) the motivations behind the human behaviors. We also show how this deciphering process can be used to categorize differences in the behavioral tendencies of different populations.
GN · Jun 26, 2017
Pricing and Referrals in Diffusion on Networks
Matt V. Leduc, Matthew O. Jackson, Ramesh Johari
When a new product or technology is introduced, potential consumers can learn its quality by trying the product, at a risk, or by letting others try it and free-riding on the information that they generate. We propose a dynamic game to study the adoption of technologies of uncertain value, when agents are connected by a network and a monopolist seller chooses a policy to maximize profits. Consumers with low degree (few friends) have incentives to adopt early, while consumers with high degree have incentives to free ride. The seller can induce high-degree consumers to adopt early by offering referral incentives (rewards to early adopters whose friends buy in the second period). Referral incentives thus lead to a "double-threshold strategy" by which low and high-degree agents adopt the product early while middle-degree agents wait. We show that referral incentives are optimal on certain networks while inter-temporal price discrimination (i.e., a first-period price discount) is optimal on others, and discuss welfare implications.