On some consequences of the permutation paradigm for data anonymization: centrality of permutation matrices, universal measures of disclosure risk and information loss, evaluation by dominance
This work addresses standardized evaluation in statistical disclosure control, for both researchers and practitioners. It appears incremental, since it builds on the existing permutation paradigm.
The paper tackles the challenge of comparing data anonymization methods by establishing universal measures of disclosure risk and information loss, enabling evaluation across any method and data characteristics, and introduces dominance concepts to account for varying privacy and information sensitivities among parties.
Recently, the permutation paradigm has been proposed in data anonymization to describe any microdata masking method as a permutation, paving the way for meaningful analytical comparisons of methods, which are currently difficult in statistical disclosure control research. This paper explores some consequences of this paradigm by establishing a class of universal measures of disclosure risk and information loss that can be used to evaluate and compare any method, under any parametrization and independently of the characteristics of the data to be anonymized. These measures lead to the introduction in data anonymization of the concepts of dominance in disclosure risk and information loss, which formalise the fact that the different parties involved in a microdata transaction can all have different sensitivities to privacy and information.
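
To illustrate what describing a masking method as a permutation can mean in practice, here is a minimal sketch for a single numerical attribute. It assumes a rank-based reverse mapping: each masked value is replaced by the original value that holds the same rank, so the release becomes a permutation of the original values. The function names (`ranks`, `reverse_map`, `permutation_matrix`) and the sample data are hypothetical and are not taken from the paper, and the rank shift computed at the end is only an illustrative quantity, not one of the paper's measures.

```python
def ranks(values):
    """Return the 0-based rank of each value (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def reverse_map(original, masked):
    """Replace each masked value by the original value of the same rank."""
    sorted_original = sorted(original)
    return [sorted_original[r] for r in ranks(masked)]


def permutation_matrix(original, masked):
    """Build the 0/1 matrix P such that P applied to `original` gives
    reverse_map(original, masked)."""
    n = len(original)
    # holder[rank] = position of the original record that has this rank
    holder = [0] * n
    for index, rank in enumerate(ranks(original)):
        holder[rank] = index
    matrix = [[0] * n for _ in range(n)]
    for row, rank in enumerate(ranks(masked)):
        matrix[row][holder[rank]] = 1
    return matrix


def apply_matrix(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Hypothetical data: one attribute and a noisy release of it.
original = [10.0, 20.0, 30.0, 40.0, 50.0]
masked = [12.5, 33.0, 27.0, 41.0, 48.0]

permuted = reverse_map(original, masked)
P = permutation_matrix(original, masked)

# Illustrative only: how far each record moved in rank.
rank_shift = [abs(a - b) for a, b in zip(ranks(original), ranks(masked))]
```

In this toy release the noise swaps the ranks of the second and third records, so the matrix `P` differs from the identity only in those two rows. A method that leaves every rank unchanged would yield the identity matrix.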