LG · CR · ML · Aug 27, 2020

Every Query Counts: Analyzing the Privacy Loss of Exploratory Data Analyses

arXiv:2008.12282v1 · 3 citations
AI Analysis

This paper addresses a privacy oversight in data analysis workflows, though the contribution is incremental: it quantifies risks that are already known.

The paper quantifies the privacy loss from basic statistical functions used in exploratory data analysis, showing that ignoring this step can significantly impact the overall privacy budget in machine learning.

An exploratory data analysis is an essential step for every data analyst to gain insights, evaluate data quality and (if required) select a machine learning model for further processing. While privacy-preserving machine learning is on the rise, more often than not this initial analysis is not counted towards the privacy budget. In this paper, we quantify the privacy loss for basic statistical functions and highlight the importance of taking it into account when calculating the privacy-loss budget of a machine learning approach.
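To illustrate why exploratory queries matter for the budget, here is a minimal sketch. It assumes ε-differential privacy, the Laplace mechanism, and basic sequential composition, where the ε values of successive queries add up. None of it is taken from the paper: `BudgetLedger`, `dp_count`, `dp_bounded_sum`, the ε values, and the sample data are all hypothetical.

```python
import random


class BudgetLedger:
    """Hypothetical ledger: tracks epsilon spent under basic sequential composition."""

    def __init__(self, total_epsilon):
        self.total_epsilon = total_epsilon
        self.spent = 0.0
        self.log = []  # (label, epsilon) for every charged query

    def charge(self, label, epsilon):
        # Refuse any query that would push the total past the budget.
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            raise ValueError(f"budget exceeded by query '{label}'")
        self.spent += epsilon
        self.log.append((label, epsilon))

    @property
    def remaining(self):
        return self.total_epsilon - self.spent


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(data, epsilon, ledger, rng):
    # Adding or removing one record changes the count by at most 1.
    ledger.charge("count", epsilon)
    return len(data) + laplace_noise(1.0 / epsilon, rng)


def dp_bounded_sum(data, lower, upper, epsilon, ledger, rng):
    # Clamp values to [lower, upper]; adding or removing one record then
    # changes the sum by at most max(|lower|, |upper|).
    ledger.charge("sum", epsilon)
    clamped = [min(max(x, lower), upper) for x in data]
    sensitivity = max(abs(lower), abs(upper))
    return sum(clamped) + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical workflow: two exploratory queries, then model training.
rng = random.Random(0)
ages = [23, 35, 41, 29, 52, 38]          # made-up sample data
ledger = BudgetLedger(total_epsilon=1.0)  # assumed overall budget

noisy_n = dp_count(ages, epsilon=0.1, ledger=ledger, rng=rng)
noisy_sum = dp_bounded_sum(ages, 0.0, 100.0, epsilon=0.2, ledger=ledger, rng=rng)

print("spent on exploration:", round(ledger.spent, 3))
print("left for training:", round(ledger.remaining, 3))
```

In this sketch the two exploratory queries consume part of the assumed budget before any model is trained. An accounting that starts only at the training step would therefore understate the total privacy loss.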

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
