Helen Nissenbaum

CY
h-index8
10papers
210citations
Novelty33%
AI Score39

10 Papers

LGApr 21
Reinforcing privacy reasoning in LLMs via normative simulacra from fiction

Matt Franchi, Madiha Zahrah Choksi, Harold Triedman et al.

Information handling practices of LLM agents are broadly misaligned with the contextual privacy expectations of their users. Contextual Integrity (CI) provides a principled framework, defining privacy as the appropriate flow of information within context-relative norms. However, existing approaches either double inference cost via supervisor-assistant architectures, or fine-tune on narrow task-specific data. We propose extracting normative simulacra (structured representations of norms and information flows) from fiction novels and using them to fine-tune LLMs via supervised learning followed by GRPO reinforcement learning. Our composite reward function combines programmatic signals, including task clarity (subsuming schema validity, construct discrimination, and extraction confidence), structural completeness, internal consistency, and context identification, with an LLM judge that evaluates whether the model's privacy reasoning is grounded in the held-out normative universe of the source text. To mitigate overfitting, we introduce per-completion contrastive scoring: each completion is evaluated against both the correct normative universe and a randomly selected wrong one, teaching the model to condition on context rather than memorize source-specific norms. We evaluate on five CI-aligned benchmarks spanning distinct societal contexts and ablate the contributions of RL and normative grounding. Across seven models, SFT introduces a conservative prior toward restricting information flow, improving recognition of privacy-relevant situations but not the correctness of privacy judgments. GRPO with normative grounding achieves the highest score on a law compliance benchmark and strongest correlation with crowdsourced human privacy expectations, demonstrating that fiction-derived normative simulacra can teach contextual privacy reasoning that transfers to real-world domains.

CYOct 5, 2023
Strategic Evaluation: Subjects, Evaluators, and Society

Benjamin Laufer, Jon Kleinberg, Karen Levy et al.

A broad current application of algorithms is in formal and quantitative measures of murky concepts -- like merit -- to make decisions. When people strategically respond to these sorts of evaluations in order to gain favorable decision outcomes, their behavior can be subjected to moral judgments. They may be described as 'gaming the system' or 'cheating,' or (in other cases) investing 'honest effort' or 'improving.' Machine learning literature on strategic behavior has tried to describe these dynamics by emphasizing the efforts expended by decision subjects hoping to obtain a more favorable assessment -- some works offer ways to preempt or prevent such manipulations, some differentiate 'gaming' from 'improvement' behavior, while others aim to measure the effort burden or disparate effects of classification systems. We begin from a different starting point: that the design of an evaluation itself can be understood as furthering goals held by the evaluator which may be misaligned with broader societal goals. To develop the idea that evaluation represents a strategic interaction in which both the evaluator and the subject of their evaluation are operating out of self-interest, we put forward a model that represents the process of evaluation using three interacting agents: a decision subject, an evaluator, and society, representing a bundle of values and oversight mechanisms. We highlight our model's applicability to a number of social systems where one or two players strategically undermine the others' interests to advance their own. Treating evaluators as themselves strategic allows us to re-cast the scrutiny directed at decision subjects, towards the incentives that underpin institutional designs of evaluations. The moral standing of strategic behaviors often depend on the moral standing of the evaluations and incentives that provoke such behaviors.

AIJun 15, 2025
Rethinking Optimization: A Systems-Based Approach to Social Externalities

Pegah Nokhiz, Aravinda Kanchana Ruwanpathirana, Helen Nissenbaum

Optimization is widely used for decision making across various domains, valued for its ability to improve efficiency. However, poor implementation practices can lead to unintended consequences, particularly in socioeconomic contexts where externalities (costs or benefits to third parties outside the optimization process) are significant. To propose solutions, it is crucial to first characterize involved stakeholders, their goals, and the types of subpar practices causing unforeseen outcomes. This task is complex because affected stakeholders often fall outside the direct focus of optimization processes. Also, incorporating these externalities into optimization requires going beyond traditional economic frameworks, which often focus on describing externalities but fail to address their normative implications or interconnected nature, and feedback loops. This paper suggests a framework that combines systems thinking with the economic concept of externalities to tackle these challenges. This approach aims to characterize what went wrong, who was affected, and how (or where) to include them in the optimization process. Economic externalities, along with their established quantification methods, assist in identifying "who was affected and how" through stakeholder characterization. Meanwhile, systems thinking (an analytical approach to comprehending relationships in complex systems) provides a holistic, normative perspective. Systems thinking contributes to an understanding of interconnections among externalities, feedback loops, and determining "when" to incorporate them in the optimization. Together, these approaches create a comprehensive framework for addressing optimization's unintended consequences, balancing descriptive accuracy with normative objectives. Using this, we examine three common types of subpar practices: ignorance, error, and prioritization of short-term goals.

CYMay 11, 2025
Privacy of Groups in Dense Street Imagery

Matt Franchi, Hauke Sandhaus, Madiha Zahrah Choksi et al.

Spatially and temporally dense street imagery (DSI) datasets have grown unbounded. In 2024, individual companies possessed around 3 trillion unique images of public streets. DSI data streams are only set to grow as companies like Lyft and Waymo use DSI to train autonomous vehicle algorithms and analyze collisions. Academic researchers leverage DSI to explore novel approaches to urban analysis. Despite good-faith efforts by DSI providers to protect individual privacy through blurring faces and license plates, these measures fail to address broader privacy concerns. In this work, we find that increased data density and advancements in artificial intelligence enable harmful group membership inferences from supposedly anonymized data. We perform a penetration test to demonstrate how easily sensitive group affiliations can be inferred from obfuscated pedestrians in 25,232,608 dashcam images taken in New York City. We develop a typology of identifiable groups within DSI and analyze privacy implications through the lens of contextual integrity. Finally, we discuss actionable recommendations for researchers working with data from DSI providers.

AIMay 27, 2023
Optimization's Neglected Normative Commitments

Benjamin Laufer, Thomas Krendl Gilbert, Helen Nissenbaum

Optimization is offered as an objective approach to resolving complex, real-world decisions involving uncertainty and conflicting interests. It drives business strategies as well as public policies and, increasingly, lies at the heart of sophisticated machine learning systems. A paradigm used to approach potentially high-stakes decisions, optimization relies on abstracting the real world to a set of decision(s), objective(s) and constraint(s). Drawing from the modeling process and a range of actual cases, this paper describes the normative choices and assumptions that are necessarily part of using optimization. It then identifies six emergent problems that may be neglected: 1) Misspecified values can yield optimizations that omit certain imperatives altogether or incorporate them incorrectly as a constraint or as part of the objective, 2) Problematic decision boundaries can lead to faulty modularity assumptions and feedback loops, 3) Failing to account for multiple agents' divergent goals and decisions can lead to policies that serve only certain narrow interests, 4) Mislabeling and mismeasurement can introduce bias and imprecision, 5) Faulty use of relaxation and approximation methods, unaccompanied by formal characterizations and guarantees, can severely impede applicability, and 6) Treating optimization as a justification for action, without specifying the necessary contextual information, can lead to ethically dubious or faulty decisions. Suggestions are given to further understand and curb the harms that can arise when optimization is used wrongfully.

CYFeb 10, 2022
Accountability in an Algorithmic Society: Relationality, Responsibility, and Robustness in Machine Learning

A. Feder Cooper, Emanuel Moss, Benjamin Laufer et al.

In 1996, Accountability in a Computerized Society [95] issued a clarion call concerning the erosion of accountability in society due to the ubiquitous delegation of consequential functions to computerized systems. Nissenbaum [95] described four barriers to accountability that computerization presented, which we revisit in relation to the ascendance of data-driven algorithmic systems--i.e., machine learning or artificial intelligence--to uncover new challenges for accountability that these systems present. Nissenbaum's original paper grounded discussion of the barriers in moral philosophy; we bring this analysis together with recent scholarship on relational accountability frameworks and discuss how the barriers present difficulties for instantiating a unified moral, relational framework in practice for data-driven algorithmic systems. We conclude by discussing ways of weakening the barriers in order to do so.

CYMay 26, 2021
Computer Vision and Conflicting Values: Describing People with Automated Alt Text

Margot Hanley, Solon Barocas, Karen Levy et al.

Scholars have recently drawn attention to a range of controversial issues posed by the use of computer vision for automatically generating descriptions of people in images. Despite these concerns, automated image description has become an important tool to ensure equitable access to information for blind and low vision people. In this paper, we investigate the ethical dilemmas faced by companies that have adopted the use of computer vision for producing alt text: textual descriptions of images for blind and low vision people, We use Facebook's automatic alt text tool as our primary case study. First, we analyze the policies that Facebook has adopted with respect to identity categories, such as race, gender, age, etc., and the company's decisions about whether to present these terms in alt text. We then describe an alternative -- and manual -- approach practiced in the museum community, focusing on how museums determine what to include in alt text descriptions of cultural artifacts. We compare these policies, using notable points of contrast to develop an analytic framework that characterizes the particular apprehensions behind these policy choices. We conclude by considering two strategies that seem to sidestep some of these concerns, finding that there are no easy ways to avoid the normative dilemmas posed by the use of computer vision to automate alt text.

CYDec 11, 2020
Interdisciplinary Approaches to Understanding Artificial Intelligence's Impact on Society

Suresh Venkatasubramanian, Nadya Bliss, Helen Nissenbaum et al.

Innovations in AI have focused primarily on the questions of "what" and "how"-algorithms for finding patterns in web searches, for instance-without adequate attention to the possible harms (such as privacy, bias, or manipulation) and without adequate consideration of the societal context in which these systems operate. In part, this is driven by incentives and forces in the tech industry, where a more product-driven focus tends to drown out broader reflective concerns about potential harms and misframings. But this focus on what and how is largely a reflection of the engineering and mathematics-focused training in computer science, which emphasizes the building of tools and development of computational concepts. As a result of this tight technical focus, and the rapid, worldwide explosion in its use, AI has come with a storm of unanticipated socio-technical problems, ranging from algorithms that act in racially or gender-biased ways, get caught in feedback loops that perpetuate inequalities, or enable unprecedented behavioral monitoring surveillance that challenges the fundamental values of free, democratic societies. Given that AI is no longer solely the domain of technologists but rather of society as a whole, we need tighter coupling of computer science and those disciplines that study society and societal values.

CYNov 27, 2020
An Ethical Highlighter for People-Centric Dataset Creation

Margot Hanley, Apoorv Khandelwal, Hadar Averbuch-Elor et al.

Important ethical concerns arising from computer vision datasets of people have been receiving significant attention, and a number of datasets have been withdrawn as a result. To meet the academic need for people-centric datasets, we propose an analytical framework to guide ethical evaluation of existing datasets and to serve future dataset creators in avoiding missteps. Our work is informed by a review and analysis of prior works and highlights where such ethical challenges arise.

CRNov 7, 2017
The VACCINE Framework for Building DLP Systems

Yan Shvartzshnaider, Zvonimir Pavlinovic, Thomas Wies et al.

Conventional Data Leakage Prevention (DLP) systems suffer from the following major drawback: Privacy policies that define what constitutes data leakage cannot be seamlessly defined and enforced across heterogeneous forms of communication. Administrators have the dual burden of: (1) manually self-interpreting policies from handbooks to specify rules (which is error-prone); (2) extracting relevant information flows from heterogeneous communication protocols and enforcing policies to determine which flows should be admissible. To address these issues, we present the Verifiable and ACtionable Contextual Integrity Norms Engine (VACCINE), a framework for building adaptable and modular DLP systems. VACCINE relies on (1) the theory of contextual integrity to provide an abstraction layer suitable for specifying reusable protocol-agnostic leakage prevention rules and (2) programming language techniques to check these rules against correctness properties and to enforce them faithfully within a DLP system implementation. We applied VACCINE to the Family Educational Rights and Privacy Act and Enron Corporation privacy regulations. We show that by using contextual integrity in conjunction with verification techniques, we can effectively create reusable privacy rules with specific correctness guarantees, and check the integrity of information flows against these rules. Our experiments in emulated enterprise settings indicate that VACCINE improves over current DLP system design approaches and can be deployed in enterprises involving tens of thousands of actors.