AI Alignment Amplifies the Role of Race, Gender, and Disability in Hiring Decisions

arXiv:2605.13866

AI Analysis

For AI developers and policymakers, the study shows that alignment can systematically reshape discrimination patterns in AI hiring tools, reversing the direction of racial discrimination and amplifying the advantage given to female candidates.

Relative to pre-trained models, aligned language models amplify hiring advantages for female and Black candidates by 325% and 330%, respectively, and disadvantages for disabled candidates by 171%. These demographic effects are comparable to six months to one year of additional education.

Humans increasingly delegate decisions to language models, yet whether these systems reproduce or reshape human patterns of discrimination remains unclear. Here we run a large-scale study to analyse whether language models use demographic information in hiring decisions. We show, across 27 models and 177 occupations, that language models give female and Black candidates hiring advantages relative to otherwise-comparable male and white candidates, while giving disabled candidates disadvantages. The differences are meaningful in magnitude: the role of race, gender, and disability status is comparable to six months to one year of additional education. Post-training alignment is the primary driver: relative to matched pre-trained models, alignment amplifies advantages for female and Black candidates by 325% and 330%, respectively, and disadvantages for disabled candidates by 171%. Compared with previous human correspondence studies, language models reverse the direction of racial discrimination, attenuate the disability penalty, and amplify the female advantage by 190%. Alignment also changes how models use qualification signals: it increases returns to skills and work experience overall, but relatively more so for female and Black candidates. Meanwhile, the absence of qualification signals harms marginalised groups more, particularly disabled candidates. These differences may explain the asymmetry of alignment effects that we observe across groups.
