CR · IR · GN · Jul 15, 2019

Confidentiality and linked data

arXiv:1907.06465v1 · 3 citations
Originality: Synthesis-oriented
AI Analysis

This article addresses the privacy risks that data providers, such as government agencies, face when sharing linked datasets. Its contribution is incremental, since it reviews existing principles and methods.

This article examines the challenge of balancing information publication with privacy protection when linking identified administrative datasets across sources and time, focusing on confidentiality risks from data outputs and micro-data release.

Data providers such as government statistical agencies perform a balancing act: maximising information published to inform decision-making and research, while simultaneously protecting privacy. The emergence of identified administrative datasets with the potential for sharing (and thus linking) offers huge potential benefits but significant additional risks. This article introduces the principles and methods of linking data across different sources and points in time, focusing on potential areas of risk. We then consider confidentiality risk, focusing in particular on the "intruder" problem central to the area, and looking at both risks from data producer outputs and from the release of micro-data for further analysis. Finally, we briefly consider potential solutions to micro-data release, both the statistical solutions considered in other contributed articles and non-statistical solutions.

