Melanie Brucks

h-index32
2papers

2 Papers

CYFeb 23
Examining and Addressing Barriers to Diversity in LLM-Generated Ideas

Yuting Deng, Melanie Brucks, Olivier Toubia

Ideas generated by independent samples of humans tend to be more diverse than ideas generated from independent LLM samples, raising concerns that widespread reliance on LLMs could homogenize ideation and undermine innovation at a societal level. Drawing on cognitive psychology, we identify (both theoretically and empirically) two mechanisms undermining LLM idea diversity. First, at the individual level, LLMs exhibit fixation just as humans do, where early outputs constrain subsequent ideation. Second, at the collective level, LLMs aggregate knowledge into a unified distribution rather than exhibiting the knowledge partitioning inherent to human populations, where each person occupies a distinct region of the knowledge space. Through four studies, we demonstrate that targeted prompting interventions can address each mechanism independently: Chain-of-Thought (CoT) prompting reduces fixation by encouraging structured reasoning (only in LLMs, not humans), while ordinary personas (versus "creative entrepreneurs" such as Steve Jobs) improve knowledge partitioning by serving as diverse sampling cues, anchoring generation in distinct regions of the semantic space. Combining both approaches produces the highest idea diversity, outperforming humans. These findings offer a theoretically grounded framework for understanding LLM idea diversity and practical strategies for human-AI collaborations that leverage AI's efficiency without compromising the diversity essential to a healthy innovation ecosystem.

CYSep 23, 2025
A Mega-Study of Digital Twins Reveals Strengths, Weaknesses and Opportunities for Further Improvement

Tianyi Peng, George Gui, Daniel J. Merlau et al.

Digital representations of individuals ("digital twins") promise to transform social science and decision-making. Yet it remains unclear whether such twins truly mirror the people they emulate. We conducted 19 preregistered studies with a representative U.S. panel and their digital twins, each constructed from rich individual-level data, enabling direct comparisons between human and twin behavior across a wide range of domains and stimuli (including never-seen-before ones). Twins reproduced individual responses with 75% accuracy and seemingly low correlation with human answers (approximately 0.2). However, this apparently high accuracy was no higher than that achieved by generic personas based on demographics only. In contrast, correlation improved when twins incorporated detailed personal information, even outperforming traditional machine learning benchmarks that require additional data. Twins exhibited systematic strengths and weaknesses - performing better in social and personality domains, but worse in political ones - and were more accurate for participants with higher education, higher income, and moderate political views and religious attendance. Together, these findings delineate both the promise and the current limits of digital twins: they capture some relative differences among individuals but not yet the unique judgments of specific people. All data and code are publicly available to support the further development and evaluation of digital twin pipelines.