Theresa Stadler

h-index10

4papers

602citations

Novelty46%

AI Score36

Ranked #96,733 of 194,257 authors (top 50%)#2,364 in CR (top 35%)

4 Papers

6.4LGFeb 19, 2024

The Fundamental Limits of Least-Privilege Learning

Theresa Stadler, Bogdan Kulynych, Michael C. Gastpar et al.

The promise of least-privilege learning -- to find feature representations that are useful for a learning task but prevent inference of any sensitive information unrelated to this task -- is highly appealing. However, so far this concept has only been stated informally. It thus remains an open question whether and how we can achieve this goal. In this work, we provide the first formalisation of the least-privilege principle for machine learning and characterise its feasibility. We prove that there is a fundamental trade-off between a representation's utility for a given task and its leakage beyond the intended task: it is not possible to learn representations that have high utility for the intended task but, at the same time prevent inference of any attribute other than the task label itself. This trade-off holds under realistic assumptions on the data distribution and regardless of the technique used to learn the feature mappings that produce these representations. We empirically validate this result for a wide range of learning techniques, model architectures, and datasets.

6.6CRMar 22, 2021

Preliminary Analysis of Potential Harms in the Luca Tracing System

Theresa Stadler, Wouter Lueks, Katharina Kohls et al.

In this document, we analyse the potential harms a large-scale deployment of the Luca system might cause to individuals, venues, and communities. The Luca system is a digital presence tracing system designed to provide health departments with the contact information necessary to alert individuals who have visited a location at the same time as a SARS-CoV-2-positive person. Multiple regional health departments in Germany have announced their plans to deploy the Luca system for the purpose of presence tracing. The system's developers suggest its use across various types of venues: from bars and restaurants to public and private events, such religious or political gatherings, weddings, and birthday parties. Recently, an extension to include schools and other educational facilities was discussed in public. Our analysis of the potential harms of the system is based on the publicly available Luca Security Concept which describes the system's security architecture and its planned protection mechanisms. The Security Concept furthermore provides a set of claims about the system's security and privacy properties. Besides an analysis of harms, our analysis includes a validation of these claims.

26.4LGNov 13, 2020Code

Synthetic Data -- Anonymisation Groundhog Day

Theresa Stadler, Bristena Oprisanu, Carmela Troncoso

Synthetic data has been advertised as a silver-bullet solution to privacy-preserving data publishing that addresses the shortcomings of traditional anonymisation techniques. The promise is that synthetic data drawn from generative models preserves the statistical properties of the original dataset but, at the same time, provides perfect protection against privacy attacks. In this work, we present the first quantitative evaluation of the privacy gain of synthetic data publishing and compare it to that of previous anonymisation techniques. Our evaluation of a wide range of state-of-the-art generative models demonstrates that synthetic data either does not prevent inference attacks or does not retain data utility. In other words, we empirically show that synthetic data does not provide a better tradeoff between privacy and utility than traditional anonymisation techniques. Furthermore, in contrast to traditional anonymisation, the privacy-utility tradeoff of synthetic data publishing is hard to predict. Because it is impossible to predict what signals a synthetic dataset will preserve and what information will be lost, synthetic data leads to a highly variable privacy gain and unpredictable utility loss. In summary, we find that synthetic data is far from the holy grail of privacy-preserving data publishing.

35.0CRMay 25, 2020Code

Decentralized Privacy-Preserving Proximity Tracing

Carmela Troncoso, Mathias Payer, Jean-Pierre Hubaux et al.

This document describes and analyzes a system for secure and privacy-preserving proximity tracing at large scale. This system, referred to as DP3T, provides a technological foundation to help slow the spread of SARS-CoV-2 by simplifying and accelerating the process of notifying people who might have been exposed to the virus so that they can take appropriate measures to break its transmission chain. The system aims to minimise privacy and security risks for individuals and communities and guarantee the highest level of data protection. The goal of our proximity tracing system is to determine who has been in close physical proximity to a COVID-19 positive person and thus exposed to the virus, without revealing the contact's identity or where the contact occurred. To achieve this goal, users run a smartphone app that continually broadcasts an ephemeral, pseudo-random ID representing the user's phone and also records the pseudo-random IDs observed from smartphones in close proximity. When a patient is diagnosed with COVID-19, she can upload pseudo-random IDs previously broadcast from her phone to a central server. Prior to the upload, all data remains exclusively on the user's phone. Other users' apps can use data from the server to locally estimate whether the device's owner was exposed to the virus through close-range physical proximity to a COVID-19 positive person who has uploaded their data. In case the app detects a high risk, it will inform the user.