CR, Dec 7, 2017

A multiplicative masking method for preserving the skewness of the original micro-records

arXiv:1712.02549v1
4 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for better data privacy methods in fields such as economics and statistics, where skewed distributions (e.g., household income) are common. It is an incremental advance because it builds on existing masking techniques.

The paper tackles the problem of preserving skewness in continuous microdata during safe dissemination, which existing methods often fail to do because they assume normality. It presents a multiplicative masking method that preserves skewness while controlling disclosure risk, and its numerical examples suggest that the method is broadly applicable.

Masking methods for the safe dissemination of microdata consist of distorting the original data while preserving a pre-defined set of statistical properties in the microdata. For continuous variables, available methodologies rely essentially on matrix masking, and in particular on adding noise to the original values, using more or less refined procedures depending on the extent of information that one seeks to preserve. Almost all of these methods rest on the critical assumption that the original datasets follow a normal distribution and/or that the noise has such a distribution. This assumption is, however, restrictive, because few variables empirically follow a Gaussian pattern: the distribution of household income, for example, is positively skewed, and this skewness is essential information that has to be considered and preserved. This paper addresses these issues by presenting a simple multiplicative masking method that preserves the skewness of the original data while offering a sufficient level of disclosure risk control. Numerical examples are provided, leading to the suggestion that this method could be well suited for the dissemination of a broad range of microdata, including those based on administrative and business records.
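To make the idea of multiplicative masking concrete, here is a minimal Python sketch of the generic technique, not the paper's specific construction, which the abstract does not spell out. It assumes each record is multiplied by independent lognormal noise with mean 1. The function names, the noise distribution, and the synthetic "income" data are all illustrative assumptions. Generic noise of this kind keeps the mean in expectation but does not by itself guarantee that skewness is preserved exactly, so the sketch only computes skewness before and after masking for comparison.

```python
import math
import random


def sample_skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def multiplicative_mask(values, sigma, rng):
    """Multiply each record by independent lognormal noise with mean 1.

    Assumption for illustration only: the paper's noise design may differ.
    """
    mu = -0.5 * sigma ** 2  # ensures E[exp(N(mu, sigma^2))] = 1
    return [v * rng.lognormvariate(mu, sigma) for v in values]


rng = random.Random(0)
# Hypothetical positively skewed "income" data; not taken from the paper.
original = [rng.lognormvariate(10.0, 0.5) for _ in range(20000)]
masked = multiplicative_mask(original, sigma=0.1, rng=rng)

mean_original = sum(original) / len(original)
mean_masked = sum(masked) / len(masked)
print("mean (original, masked):", mean_original, mean_masked)
print("skewness (original, masked):",
      sample_skewness(original), sample_skewness(masked))
```

A larger `sigma` distorts individual records more, which lowers disclosure risk at the cost of moving the masked moments further from the originals. A method that targets skewness explicitly, as the paper describes, would have to constrain the noise beyond this mean-one condition.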

