David Pujol

DB
4 papers
24 citations
Novelty: 51%
AI Score: 29

4 Papers

CR, May 2, 2025
Slowly Scaling Per-Record Differential Privacy

Brian Finley, Anthony M Caruso, Justin C Doty et al.

We develop formal privacy mechanisms for releasing statistics from data with many outlying values, such as income data. These mechanisms ensure that a per-record differential privacy guarantee degrades slowly in the protected records' influence on the statistics being released. Formal privacy mechanisms generally add randomness, or "noise," to published statistics. If a noisy statistic's distribution changes little with the addition or deletion of a single record in the underlying dataset, an attacker looking at this statistic will find it plausible that any particular record was present or absent, preserving the records' privacy. More influential records -- those whose addition or deletion would change the statistics' distribution more -- typically suffer greater privacy loss. The per-record differential privacy framework quantifies these record-specific privacy guarantees, but existing mechanisms let these guarantees degrade rapidly (linearly or quadratically) with influence. While this may be acceptable in cases with some moderately influential records, it results in unacceptably high privacy losses when records' influence varies widely, as is common in economic data. We develop mechanisms with privacy guarantees that instead degrade as slowly as logarithmically with influence. These mechanisms allow for the accurate, unbiased release of statistics, while providing meaningful protection for highly influential records. As an example, we consider the private release of sums of unbounded establishment data such as payroll, where our mechanisms extend meaningful privacy protection even to very large establishments. We evaluate these mechanisms empirically and demonstrate their utility.
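
The "linearly or quadratically" degradation mentioned in this abstract can be illustrated with two textbook additive-noise mechanisms. The sketch below is not the paper's mechanism. It assumes a sum released with Laplace noise (per-record loss measured as a pure-DP epsilon) or with Gaussian noise (per-record loss measured as a zCDP rho), and it takes a record's influence to be the amount by which adding or removing that record shifts the sum. The function names and payroll figures are hypothetical.

```python
def laplace_per_record_epsilon(influence, scale):
    """Pure-DP loss for a record that shifts a Laplace-noised sum by `influence`.

    The density ratio of two Laplace distributions with the same scale,
    shifted by `influence`, is bounded by exp(|influence| / scale).
    """
    return abs(influence) / scale


def gaussian_per_record_rho(influence, sigma):
    """zCDP loss for a record that shifts a Gaussian-noised sum by `influence`."""
    return influence ** 2 / (2 * sigma ** 2)


if __name__ == "__main__":
    # Hypothetical establishment payrolls spanning several orders of magnitude.
    payrolls = [5e4, 1e6, 1e9]
    for payroll in payrolls:
        eps = laplace_per_record_epsilon(payroll, scale=1e5)
        rho = gaussian_per_record_rho(payroll, sigma=1e5)
        print(f"payroll={payroll:.0e}  epsilon={eps:.3g}  rho={rho:.3g}")
```

With a fixed noise scale, the loss grows in direct proportion to influence under Laplace noise and with its square under Gaussian noise, which is why widely varying records are hard to protect with these baselines.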

CY, Nov 8, 2021
Equity and Privacy: More Than Just a Tradeoff

David Pujol, Ashwin Machanavajjhala

While the entire field of privacy-preserving data analytics is focused on the privacy-utility tradeoff, recent work has shown that privacy-preserving data publishing can introduce different levels of utility across different population groups. It is important to understand this new tradeoff between privacy and equity as privacy technology is being deployed in situations where the data products will be used for research and policy making. Will marginal populations see disproportionately less utility from privacy technology? If there is an inequity, how can we address it?

DB, Nov 2, 2020
Budget Sharing for Multi-Analyst Differential Privacy

David Pujol, Yikai Wu, Brandon Fain et al.

Large organizations that collect data about populations (like the US Census Bureau) release summary statistics that are used by multiple stakeholders for resource allocation and policy making problems. These organizations are also legally required to protect the privacy of individuals from whom they collect data. Differential Privacy (DP) provides a solution to release useful summary data while preserving privacy. Most DP mechanisms are designed to answer a single set of queries. In reality, there are often multiple stakeholders that use a given data release and have overlapping but non-identical queries. This introduces a novel joint optimization problem in DP where the privacy budget must be shared among different analysts. We initiate the study of DP query answering across multiple analysts. To capture the competing goals and priorities of multiple analysts, we formulate three desiderata that any mechanism should satisfy in this setting -- The Sharing Incentive, Non-Interference, and Adaptivity -- while still optimizing for overall error. We demonstrate how existing DP query answering mechanisms in the multi-analyst setting fail to satisfy at least one of the desiderata. We present novel DP algorithms that provably satisfy all our desiderata and empirically show that they incur low error on realistic tasks.
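
Why overlapping queries make budget sharing attractive can be seen with a little arithmetic. The sketch below is not one of the paper's algorithms. It assumes a single sensitivity-1 count query that every analyst wants, answered with the Laplace mechanism, and compares an even split of the budget (each analyst gets a separate noisy answer) with one shared answer that uses the whole budget. The function names are hypothetical.

```python
def laplace_expected_abs_error(sensitivity, epsilon):
    """Expected |noise| of the Laplace mechanism, which equals its scale."""
    return sensitivity / epsilon


def compare_split_vs_shared(total_epsilon, num_analysts, sensitivity=1.0):
    """Return (error if each analyst uses an equal share, error if one answer is shared).

    Both strategies spend total_epsilon overall: k separate releases at
    total_epsilon / k each compose sequentially to total_epsilon, and a
    single release at total_epsilon can be reused by every analyst for free.
    """
    per_analyst_epsilon = total_epsilon / num_analysts
    independent = laplace_expected_abs_error(sensitivity, per_analyst_epsilon)
    shared = laplace_expected_abs_error(sensitivity, total_epsilon)
    return independent, shared


if __name__ == "__main__":
    independent, shared = compare_split_vs_shared(total_epsilon=1.0, num_analysts=4)
    print(f"independent error per analyst: {independent}")
    print(f"shared error per analyst:      {shared}")
```

In this toy case the shared answer cuts every analyst's expected error by a factor equal to the number of analysts. Real workloads overlap only partially, which is where the joint optimization described in the abstract comes in.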

DB, Aug 27, 2019
Answering Summation Queries for Numerical Attributes under Differential Privacy

Yikai Wu, David Pujol, Ios Kotsogiannis et al.

In this work, we explore the problem of answering a set of sum queries under Differential Privacy. This is a little-understood, non-trivial problem, especially in the case of numerical domains. We show that traditional techniques from the literature are not always the best choice and that a more rigorous approach is necessary to develop low-error algorithms.
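
For context, one common baseline for a private sum is to clip each value to a fixed range and add Laplace noise. The sketch below shows that baseline only, not the approach developed in the paper. It assumes add/remove-one-record neighboring datasets; the names `private_sum` and `laplace_noise` and the clipping bounds are hypothetical. It samples noise with floating-point arithmetic, which is fine for illustration but not hardened for production use.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_sum(values, lower, upper, epsilon, rng=None):
    """Clip each value to [lower, upper], sum, and add Laplace noise.

    Adding or removing one clipped record changes the sum by at most
    max(|lower|, |upper|), so that is the sensitivity used for the noise scale.
    """
    rng = rng or random.Random()
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = max(abs(lower), abs(upper))
    return sum(clipped) + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    incomes = [30_000.0, 45_000.0, 52_000.0, 2_500_000.0]  # one outlier
    print(private_sum(incomes, lower=0.0, upper=100_000.0, epsilon=1.0))
```

The clipping bound forces a choice: a low bound biases the sum by truncating outliers, while a high bound inflates the noise for everyone. That tension is one reason sum queries over numerical domains are harder than counts.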