3 Papers

AIApr 19
Beyond Static Snapshots: A Grounded Evaluation Framework for Language Models at the Agentic Frontier

Jazmia Henry

We argue that current evaluation frameworks for large language models (LLMs) suffer from four systematic failures that make them structurally inadequate for assessing deployed, agentic systems: distributional invalidity (evaluation inputs do not reflect real interaction distributions), temporal invalidity (evaluations are post-hoc rather than training-integrated), scope invalidity (evaluations measure single-turn outputs rather than long-horizon trajectories), and process invalidity (evaluations assess outputs rather than reasoning). These failures compound critically in RLHF, where reward models are evaluated under conditions that do not hold during RL training, making reward hacking a predictable consequence of evaluation design rather than a training pathology. We propose the Grounded Continuous Evaluation (GCE) framework and present ISOPro, a simulation-based fine-tuning and evaluation system. ISOPro replaces the learned reward model with a deterministic ground-truth verifier, eliminating reward hacking by construction in verifiable-reward domains, and operates on LoRA adapter weights updatable on CPU, reducing the hardware barrier by an order of magnitude. We validate ISOPro on a resource-constrained scheduling domain with six difficulty tiers, demonstrating capability emergence visible only through continuous evaluation, an implicit curriculum that forms without researcher curation, and a 3x accuracy improvement over zero-shot baselines, all on consumer hardware with 0.216% trainable parameters.

AIFeb 8, 2024
Prompting Fairness: Artificial Intelligence as Game Players

Jazmia Henry

Utilitarian games such as dictator games to measure fairness have been studied in the social sciences for decades. These games have given us insight into not only how humans view fairness but also in what conditions the frequency of fairness, altruism and greed increase or decrease. While these games have traditionally been focused on humans, the rise of AI gives us the ability to study how these models play these games. AI is becoming a constant in human interaction and examining how these models portray fairness in game play can give us some insight into how AI makes decisions. Over 101 rounds of the dictator game, I conclude that AI has a strong sense of fairness that is dependant of it it deems the person it is playing with as trustworthy, framing has a strong effect on how much AI gives a recipient when designated the trustee, and there may be evidence that AI experiences inequality aversion just as humans.

LGJan 13, 2023
MLOps: A Primer for Policymakers on a New Frontier in Machine Learning

Jazmia Henry

This chapter is written with the Data Scientist or MLOps professional in mind but can be used as a resource for policy makers, reformists, AI Ethicists, sociologists, and others interested in finding methods that help reduce bias in algorithms. I will take a deployment centered approach with the assumption that the professionals reading this work have already read the amazing work on the implications of algorithms on historically marginalized groups by Gebru, Buolamwini, Benjamin and Shane to name a few. If you have not read those works, I refer you to the "Important Reading for Ethical Model Building" list at the end of this paper as it will help give you a framework on how to think about Machine Learning models more holistically taking into account their effect on marginalized people. In the Introduction to this chapter, I root the significance of their work in real world examples of what happens when models are deployed without transparent data collected for the training process and are deployed without the practitioners paying special attention to what happens to models that adapt to exploit gaps between their training environment and the real world. The rest of this chapter builds on the work of the aforementioned researchers and discusses the reality of models performing post production and details ways ML practitioners can identify bias using tools during the MLOps lifecycle to mitigate bias that may be introduced to models in the real world.