CL, CY · Nov 5, 2024

Growing a Tail: Increasing Output Diversity in Large Language Models

Michal Shur-Ofry, Bar Horowitz-Amsalem, Adir Rahamim, Yonatan Belinkov

arXiv:2411.02989v17.711 citations · h-index: 55

Originality: Incremental advance

AI Analysis

This work addresses how to preserve cultural diversity in AI outputs, a concern for both policymakers and developers. It is an incremental advance, since it builds on existing techniques.

The study tackled the problem of low output diversity in large language models compared to humans, finding that a combination of increased generation randomness, diverse prompting, and model aggregation significantly boosts diversity to human-like levels.

How diverse are the outputs of large language models when diversity is desired? We examine the diversity of responses of various models to questions with multiple possible answers, comparing them with human responses. Our findings suggest that models' outputs are highly concentrated, reflecting a narrow, mainstream 'worldview', in comparison to humans, whose responses exhibit a much longer tail. We examine three ways to increase models' output diversity: 1) increasing generation randomness via temperature sampling; 2) prompting models to answer from diverse perspectives; 3) aggregating outputs from several models. A combination of these measures significantly increases models' output diversity, reaching that of humans. We discuss implications of these findings for AI policy that wishes to preserve cultural diversity, an essential building block of a democratic social fabric.
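
To make two of the measures named in the abstract concrete, here is a minimal sketch of temperature sampling and of aggregating answers from several models. It is an illustration under stated assumptions, not the authors' implementation: the logits, the answer lists, and the helper names are hypothetical, and Shannon entropy is used only as one convenient stand-in for "diversity", which may differ from the metric used in the paper.

```python
import math
from collections import Counter


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def shannon_entropy(probs):
    """Entropy in bits; larger values mean probability mass is more spread out."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def answer_distribution(answers):
    """Empirical probability of each distinct answer in a list of responses."""
    counts = Counter(answers)
    n = len(answers)
    return [c / n for c in counts.values()]


# Hypothetical logits over four candidate answers (assumed values).
logits = [4.0, 2.0, 1.0, 0.5]
low_temp = softmax_with_temperature(logits, 0.5)
high_temp = softmax_with_temperature(logits, 1.5)
print("entropy at T=0.5:", shannon_entropy(low_temp))
print("entropy at T=1.5:", shannon_entropy(high_temp))

# Hypothetical responses from two models to the same open-ended question.
model_a = ["Einstein", "Einstein", "Newton", "Einstein"]
model_b = ["Curie", "Einstein", "Darwin", "Curie"]
pooled = model_a + model_b  # aggregation: pool the outputs of both models

entropy_a = shannon_entropy(answer_distribution(model_a))
entropy_b = shannon_entropy(answer_distribution(model_b))
entropy_pooled = shannon_entropy(answer_distribution(pooled))
print("model A:", entropy_a, "model B:", entropy_b, "pooled:", entropy_pooled)
```

In this toy setup, raising the temperature moves probability from the dominant answer toward the tail, and pooling the two models yields a wider spread of answers than either model alone. Pooling does not guarantee higher entropy in general; it does here because the two hypothetical models favor different answers.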

