Dustin S. Stoltz

CY
3 papers
48 citations
Novelty 15%
AI Score 32

3 Papers

CL, Jan 16
Selecting Language Models for Social Science: Start Small, Start Open, and Validate

Dustin S. Stoltz, Marshall A. Taylor, Sanuj Kumar

Currently, there are thousands of large pretrained language models (LLMs) available to social scientists. How do we select among them? Using validity, reliability, reproducibility, and replicability as guides, we explore the significance of: (1) model openness, (2) model footprint, (3) training data, and (4) model architectures and fine-tuning. While ex-ante tests of validity (i.e., benchmarks) are often privileged in these discussions, we argue that social scientists cannot altogether avoid validating computational measures (ex-post). Replicability, in particular, is a more pressing guide for selecting language models. Being able to reliably replicate a particular finding that entails the use of a language model necessitates reliably reproducing a task. To this end, we propose starting with smaller, open models, and constructing delimited benchmarks to demonstrate the validity of the entire computational pipeline.
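The ex-post validation step the abstract calls for can be illustrated with a minimal sketch. It assumes a small, hand-coded "delimited benchmark" and compares a model's labels against it using raw agreement and Cohen's kappa. The label lists and function names below are hypothetical illustrations, not the authors' pipeline or data.

```python
from collections import Counter

def accuracy(gold, predicted):
    """Share of items where the model label matches the hand-coded label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)

def cohens_kappa(a, b):
    """Chance-corrected agreement between two sets of labels."""
    n = len(a)
    observed = accuracy(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Agreement expected if both coders labeled independently
    # at their observed marginal rates.
    expected = sum(counts_a[l] * counts_b[l] for l in set(a) | set(b)) / n**2
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical benchmark: eight documents hand-coded by researchers,
# and the labels a candidate language model assigned to the same documents.
hand_coded = ["pos", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
model_out = ["pos", "neg", "pos", "pos", "pos", "neg", "neg", "neg"]

acc = accuracy(hand_coded, model_out)
kappa = cohens_kappa(hand_coded, model_out)
print(f"accuracy={acc:.2f}, kappa={kappa:.2f}")
```

Rerunning the same comparison on repeated model runs, or on a second model, would give a rough check on reliability and reproducibility in the sense the abstract uses those terms.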

CY, Nov 21, 2025
Generative AI in Sociological Research: State of the Discipline

AJ Alvero, Dustin S. Stoltz, Oscar Stuhler et al.

Generative artificial intelligence (GenAI) has garnered considerable attention for its potential utility in research and scholarship. A growing body of work in sociology and related fields demonstrates both the potential advantages and risks of GenAI, but these studies are largely proof-of-concept or specific audits of models and products. We know comparatively little about how sociologists actually use GenAI in their research practices and how they view its present and future role in the discipline. In this paper, we describe the current landscape of GenAI use in sociological research based on a survey of authors in 50 sociology journals. Our sample includes both computational sociologists and non-computational sociologists and their collaborators. We find that sociologists primarily use GenAI to assist with writing tasks: revising, summarizing, editing, and translating their own work. Respondents report that GenAI saves time and that they are curious about its capabilities, but they do not currently feel strong institutional or field-level pressure to adopt it. Overall, respondents are wary of GenAI's social and environmental impacts and express low levels of trust in its outputs, but many believe that GenAI tools will improve over the next several years. We do not find large differences between computational and non-computational scholars in terms of GenAI use, attitudes, and concern; nor do we find strong patterns by familiarity or frequency of use. We discuss what these findings suggest about the future of GenAI in sociology and highlight challenges for developing shared norms around its use in research practice.

CY, Jul 9, 2020
Cultural Cartography with Word Embeddings

Dustin S. Stoltz, Marshall A. Taylor

Using the frequency of keywords is a classic approach in the formal analysis of text, but has the drawback of glossing over the relationality of word meanings. Word embedding models overcome this problem by constructing a standardized and continuous "meaning space" where words are assigned a location according to their similarity to other words, as inferred from how they are used in natural language samples. We show how word embeddings are commensurate with prevailing theories of meaning in sociology and can be put to the task of interpretation via two kinds of navigation. First, one can hold terms constant and measure how the embedding space moves around them, much like astronomers measured the changing of celestial bodies with the seasons. Second, one can also hold the embedding space constant and see how documents or authors move relative to it, just as ships use the stars on a given night to determine their location. Using the empirical case of immigration discourse in the United States, we demonstrate the merits of these two broad strategies for advancing important topics in cultural theory, including social marking, media fields, echo chambers, and cultural diffusion and change more broadly.
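The second kind of navigation, holding the embedding space fixed and locating documents relative to it, can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the three-dimensional vectors are invented toy values, and averaging word vectors into a document centroid is a stand-in technique chosen for brevity, not necessarily the measure the authors use.

```python
import math

# Hypothetical toy embeddings; real models have hundreds of dimensions
# and are trained on large natural language samples.
EMBEDDINGS = {
    "immigrant": [0.9, 0.1, 0.0],
    "border":    [0.8, 0.2, 0.1],
    "visa":      [0.7, 0.3, 0.0],
    "recipe":    [0.0, 0.1, 0.9],
    "oven":      [0.1, 0.0, 0.8],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def document_vector(tokens, embeddings):
    """Average the vectors of in-vocabulary tokens (a simple centroid)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no tokens found in the embedding vocabulary")
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

def engagement(tokens, term, embeddings=EMBEDDINGS):
    """How close a document sits to a fixed term in the embedding space."""
    return cosine(document_vector(tokens, embeddings), embeddings[term])

# Two hypothetical documents, already tokenized. "policy" is out of
# vocabulary and is skipped when building the centroid.
doc_a = ["border", "visa", "policy"]
doc_b = ["recipe", "oven"]

score_a = engagement(doc_a, "immigrant")
score_b = engagement(doc_b, "immigrant")
print(f"doc_a: {score_a:.3f}  doc_b: {score_b:.3f}")
```

Because the embedding space is held constant, scores for different documents or authors are directly comparable, which is what lets them be placed relative to the same fixed reference point.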