AJ Alvero

CY
h-index9
4papers
54citations
Novelty28%
AI Score38

4 Papers

HCMay 12
Does Algorithmic Uncertainty Sway Human Experts? Evidence from a Field Experiment in Selective College Admissions

Hansol Lee, AJ Alvero, René F. Kizilcec et al.

Algorithmic predictions are inherently uncertain: even models with similar aggregate accuracy can produce different predictions for the same individual, raising concerns that high-stakes decisions may become sensitive to arbitrary modeling choices. In this paper, we define \emph{algorithmic sensitivity} as the extent to which arbitrary modeling choices propagate into human decisions: how much a decision outcome shifts when a more favorable versus less favorable algorithmic prediction is presented to the decision-maker for the same individual. We estimate this in a randomized field experiment ($n=19{,}545$) embedded in a selective U.S. college admissions cycle, in which admissions officers reviewed each application alongside an algorithmic score while we randomly varied whether the score came from one of two similarly accurate prediction models. Although the two models performed similarly in aggregate, they frequently assigned different scores to the same applicant, creating exogenous variation in the score shown. Surprisingly, we find little evidence of algorithmic sensitivity: presenting a more favorable score does not meaningfully increase an applicant's probability of admission on average, even when the models disagree substantially. These findings suggest that, in this expert, high-stakes setting, human decision-making is largely invariant to arbitrary variation in algorithmic predictions, underscoring the role of professional discretion and institutional context in mediating the downstream effects of algorithmic uncertainty.

CLMar 25, 2025
Poor Alignment and Steerability of Large Language Models: Evidence from College Admission Essays

Jinsook Lee, AJ Alvero, Thorsten Joachims et al.

People are increasingly using technologies equipped with large language models (LLM) to write texts for formal communication, which raises two important questions at the intersection of technology and society: Who do LLMs write like (model alignment); and can LLMs be prompted to change who they write like (model steerability). We investigate these questions in the high-stakes context of undergraduate admissions at a selective university by comparing lexical and sentence variation between essays written by 30,000 applicants to two types of LLM-generated essays: one prompted with only the essay question used by the human applicants; and another with additional demographic information about each applicant. We consistently find that both types of LLM-generated essays are linguistically distinct from human-authored essays, regardless of the specific model and analytical approach. Further, prompting a specific sociodemographic identity is remarkably ineffective in aligning the model with the linguistic patterns observed in human writing from this identity group. This holds along the key dimensions of sex, race, first-generation status, and geographic location. The demographically prompted and unprompted synthetic texts were also more similar to each other than to the human text, meaning that prompting did not alleviate homogenization. These issues of model alignment and steerability in current LLMs raise concerns about the use of LLMs in high-stakes contexts.

CYNov 21, 2025
Generative AI in Sociological Research: State of the Discipline

AJ Alvero, Dustin S. Stoltz, Oscar Stuhler et al.

Generative artificial intelligence (GenAI) has garnered considerable attention for its potential utility in research and scholarship. A growing body of work in sociology and related fields demonstrates both the potential advantages and risks of GenAI, but these studies are largely proof-of-concept or specific audits of models and products. We know comparatively little about how sociologists actually use GenAI in their research practices and how they view its present and future role in the discipline. In this paper, we describe the current landscape of GenAI use in sociological research based on a survey of authors in 50 sociology journals. Our sample includes both computational sociologists and non-computational sociologists and their collaborators. We find that sociologists primarily use GenAI to assist with writing tasks: revising, summarizing, editing, and translating their own work. Respondents report that GenAI saves time and that they are curious about its capabilities, but they do not currently feel strong institutional or field-level pressure to adopt it. Overall, respondents are wary of GenAI's social and environmental impacts and express low levels of trust in its outputs, but many believe that GenAI tools will improve over the next several years. We do not find large differences between computational and non-computational scholars in terms of GenAI use, attitudes, and concern; nor do we find strong patterns by familiarity or frequency of use. We discuss what these findings suggest about the future of GenAI in sociology and highlight challenges for developing shared norms around its use in research practice.

CYDec 17, 2019
AI and Holistic Review: Informing Human Reading in College Admissions

AJ Alvero, Noah Arthurs, anthony lising antonio et al.

College admissions in the United States is carried out by a human-centered method of evaluation known as holistic review, which typically involves reading original narrative essays submitted by each applicant. The legitimacy and fairness of holistic review, which gives human readers significant discretion over determining each applicant's fitness for admission, has been repeatedly challenged in courtrooms and the public sphere. Using a unique corpus of 283,676 application essays submitted to a large, selective, state university system between 2015 and 2016, we assess the extent to which applicant demographic characteristics can be inferred from application essays. We find a relatively interpretable classifier (logistic regression) was able to predict gender and household income with high levels of accuracy. Findings suggest that data auditing might be useful in informing holistic review, and perhaps other evaluative systems, by checking potential bias in human or computational readings.