Sara Zannone

h-index: 5

5 papers

194 citations

Novelty: 34%

AI Score: 20

Ranked #184,555 of 194,257 authors (top 95%); #39,221 in LG (top 98%)

5 Papers

20.0 AI Jul 8

Alignment Plausibility: A New Standard for Assuring AI in Healthcare

Gwydion Williams, Sara Zannone, Bilal A Mateen

Large language models (LLMs) have become significant providers of mental health support, yet they remain products of an attention economy whose operational and commercial targets favour sustained engagement over the friction that effective psychological support often requires. Developers' safety responses have been largely reactive, addressing the most visible and acute harms while subtler, longer-term patterns of risk (e.g., dependency, boundary erosion, the amplification of distorted beliefs) receive less attention. We contend that making LLMs structurally safe requires alignment organised at three levels that mirror how society assures the safety of human clinical practice: 1) explicit value specification grounded in the codified normative commitments of clinical practice; 2) training that embeds those values in the model; and 3) oversight that detects drift and longer-term harm during deployment, much as clinical supervision does for human practice. Organising alignment in this way yields a construct we call alignment plausibility - a structured demonstration that a system's values, training regime, and oversight mechanisms are together consistent with safe and positive outcomes. We propose alignment plausibility as a regulatory construct (by drawing analogy to the established construct of biological plausibility) for AI in health: a principled way to argue for, or against, trust that systems are aligned to positive health outcomes, will cause no harm even where capable of doing so, and will ultimately lead to patient benefit.

9.8 CV Feb 22, 2023

Uncovering Bias in Face Generation Models

Cristian Muñoz, Sara Zannone, Umar Mohammed et al.

Recent advancements in GANs and diffusion models have enabled the creation of high-resolution, hyper-realistic images. However, these models may misrepresent certain social groups and present bias. Understanding bias in these models remains an important research question, especially for tasks that support critical decision-making and could affect minorities. The contribution of this work is a novel analysis covering architectures and embedding spaces for a fine-grained understanding of bias over three approaches: generators, attribute modifiers, and post-processing bias mitigators. This work shows that generators suffer from bias across all social groups, with attribute preferences of 75%-85% for whiteness and 60%-80% for the female gender (for all trained CelebA models) and low probabilities of generating children and older men. Modifiers and mitigators work as post-processors and change the generator performance. For instance, attribute channel perturbation strategies modify the embedding spaces. We quantify the influence of this change on group fairness by measuring the impact on image quality and group features. Specifically, we use the Fréchet Inception Distance (FID), the Face Matching Error and the Self-Similarity score. For InterFaceGAN, we analyze one and two attribute channel perturbations and examine the effect on the fairness distribution and the quality of the image. Finally, we analyze the post-processing bias mitigators, which are the fastest and most computationally efficient way to mitigate bias. We find that these mitigation techniques show similar results on KL divergence and FID score; however, self-similarity scores show a different feature concentration on the new groups of the data distribution. The weaknesses and ongoing challenges described in this work must be considered in the pursuit of creating fair and unbiased face generation models.
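The abstract mentions KL divergence as one of the measures used to compare group distributions. As a rough illustration only, the sketch below computes the KL divergence between the attribute distribution of a set of generated faces and a reference distribution. The uniform reference, the function name `kl_divergence`, and the 80%/20% split are assumptions chosen for the example; they are not taken from the paper.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions given as lists.

    Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical share of generated faces per group (illustrative numbers only).
observed = [0.8, 0.2]
# Assumed fairness reference: every group equally likely.
uniform = [0.5, 0.5]

skew = kl_divergence(observed, uniform)      # grows as the generator skews
no_skew = kl_divergence(uniform, uniform)    # identical distributions give 0
```

A value of zero means the generated distribution matches the reference exactly; larger values indicate a stronger departure from it.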

6.6 LG Feb 8, 2023

Local Law 144: A Critical Analysis of Regression Metrics

Giulio Filippi, Sara Zannone, Airlie Hilliard et al.

The use of automated decision tools in recruitment has received an increasing amount of attention. In November 2021, the New York City Council passed legislation (Local Law 144) that mandates bias audits of Automated Employment Decision Tools. From 15th April 2023, companies that use automated tools for hiring or promoting employees are required to have these systems audited by an independent entity. Auditors are asked to compute bias metrics that compare outcomes for different groups, based on sex/gender and race/ethnicity categories at a minimum. Local Law 144 proposes novel bias metrics for regression tasks (scenarios where the automated system scores candidates with a continuous range of values). A previous version of the legislation proposed a bias metric that compared the mean scores of different groups. The revised bias metric compares the proportion of candidates in each group that falls above the median. In this paper, we argue that both metrics fail to capture distributional differences over the whole domain, and therefore cannot reliably detect bias. We first introduce two metrics as possible alternatives to the legislation metrics. We then compare these metrics over a range of theoretical examples, for which the metrics proposed in the legislation seem to underestimate bias. Finally, we study real data and show that the legislation metrics can similarly fail in a real-world recruitment application.
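The argument that both legislation metrics can miss distributional differences can be illustrated with a small toy example. The sketch below is not the paper's method: the scores are invented, the median is assumed to be taken over the pooled sample, and the maximum gap between empirical CDFs (a Kolmogorov-Smirnov-style statistic) is used here only as a stand-in for a distribution-aware metric, since the abstract does not name the two alternatives it introduces.

```python
from statistics import mean, median

def rate_above(scores, threshold):
    """Proportion of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)

def max_cdf_gap(a, b):
    """Largest absolute gap between the empirical CDFs of a and b."""
    def ecdf(xs, x):
        return sum(v <= x for v in xs) / len(xs)
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)

# Hypothetical candidate scores: group A is spread out, group B is clustered.
group_a = [1, 2, 3, 7, 8, 9]
group_b = [4, 4, 4, 6, 6, 6]

pooled_median = median(group_a + group_b)    # assumption: median of the pooled sample

mean_gap = mean(group_a) - mean(group_b)     # earlier metric: difference in means
rate_a = rate_above(group_a, pooled_median)  # revised metric: share above the median
rate_b = rate_above(group_b, pooled_median)
cdf_gap = max_cdf_gap(group_a, group_b)      # distribution-aware comparison
```

In this toy data both groups have the same mean and the same share above the pooled median, so neither legislation metric flags a difference, yet a cut-off at a score of 7 would select half of group A and nobody from group B. The CDF gap is nonzero and picks up the discrepancy.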

2.0 LG Feb 24, 2023

Intersectional Fairness: A Fractal Approach

Giulio Filippi, Sara Zannone, Adriano Koshiyama

The issue of fairness in AI has received an increasing amount of attention in recent years. The problem can be approached by looking at different protected attributes (e.g., ethnicity, gender) independently, but fairness for individual protected attributes does not imply intersectional fairness. In this work, we frame the problem of intersectional fairness within a geometrical setting. We project our data onto a hypercube, and split the analysis of fairness by levels, where each level encodes the number of protected attributes we are intersecting over. We prove mathematically that, while fairness does not propagate "down" the levels, it does propagate "up" the levels. This means that ensuring fairness for all subgroups at the lowest intersectional level (e.g., black women, white women, black men and white men) will necessarily result in fairness for all the levels above, including each of the protected attributes (e.g., ethnicity and gender) taken independently. We also derive a formula describing the variance of the set of estimated success rates on each level, under the assumption of perfect fairness. Using this theoretical finding as a benchmark, we define a family of metrics which capture overall intersectional bias. Finally, we propose that fairness can be metaphorically thought of as a "fractal" problem. In fractals, patterns at the smallest scale repeat at a larger scale. We see from this example that tackling the problem at the lowest possible level, in a bottom-up manner, leads to the natural emergence of fair AI. We suggest that trustworthiness is necessarily an emergent, fractal and relational property of the AI system.
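The "up but not down" claim can be checked on toy counts. The sketch below assumes that fairness means equal success rates across the groups at a given level, and all counts, group labels, and function names are hypothetical. It aggregates subgroup counts into single-attribute groups and compares the rates at each level.

```python
from fractions import Fraction

def success_rates(counts, attribute_index=None):
    """Success rate per group.

    counts maps (ethnicity, gender) -> (group size, number of successes).
    With attribute_index=None the lowest-level subgroups are used; with 0 or 1
    the counts are aggregated over the other attribute first.
    """
    totals = {}
    for key, (n, k) in counts.items():
        group = key if attribute_index is None else key[attribute_index]
        total_n, total_k = totals.get(group, (0, 0))
        totals[group] = (total_n + n, total_k + k)
    return {g: Fraction(k, n) for g, (n, k) in totals.items()}

def is_fair(rates):
    """Assumed criterion: every group has exactly the same success rate."""
    return len(set(rates.values())) == 1

# Fair at the lowest level (every subgroup succeeds at 40%), unequal sizes.
fair_subgroups = {
    ("black", "woman"): (10, 4), ("white", "woman"): (20, 8),
    ("black", "man"): (30, 12), ("white", "man"): (40, 16),
}
# Fair on each attribute separately (50% everywhere), unfair at the intersection.
fair_marginals_only = {
    ("black", "woman"): (10, 8), ("white", "woman"): (10, 2),
    ("black", "man"): (10, 2), ("white", "man"): (10, 8),
}

up = [is_fair(success_rates(fair_subgroups, i)) for i in (None, 0, 1)]
down = [is_fair(success_rates(fair_marginals_only, i)) for i in (None, 0, 1)]
```

In the first dataset, equal subgroup rates carry over to ethnicity and gender taken separately, because each aggregated rate is a weighted average of identical values. In the second, both single-attribute views look fair while the subgroup rates range from 20% to 80%, which is the failure to propagate "down".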

14.2 HC Aug 13, 2019

Modeling Personality vs. Modeling Personalidad: In-the-wild Mobile Data Analysis in Five Countries Suggests Cultural Impact on Personality Models

Mohammed Khwaja, Sumer S. Vaid, Sara Zannone et al.

Sensor data collected from smartphones provides the possibility to passively infer a user's personality traits. Such models can be used to enable technology personalization, while contributing to our substantive understanding of how human behavior manifests in daily life. A significant challenge in personality modeling involves improving the accuracy of personality inferences; however, research has yet to assess and consider the cultural impact of users' country of residence on model replicability. We collected mobile sensing data and self-reported Big Five traits from 166 participants (54 women and 112 men) recruited in five different countries (UK, Spain, Colombia, Peru, and Chile) for 3 weeks. We developed machine learning based personality models using culturally diverse datasets -- representing different countries -- and we show that such models can achieve state-of-the-art accuracy when tested in new countries, ranging from 63% (Agreeableness) to 71% (Extraversion) of classification accuracy. Our results indicate that using country-specific datasets can improve the classification accuracy by between 3% and 7% for Extraversion, Agreeableness, and Conscientiousness. We show that these findings hold regardless of gender and age balance in the dataset. Interestingly, using gender- or age-balanced datasets as well as gender-separated datasets improves trait prediction by up to 17%. We unpack differences in personality models across the five countries, highlight the most predictive data categories (location, noise, unlocks, accelerometer), and provide takeaways to technologists and social scientists interested in passive personality assessment.