k-anonymous Microdata Release via Post Randomisation Method
This work addresses privacy concerns in statistical disclosure control for data publishers, offering a method to enhance existing algorithms with dual privacy guarantees, though it appears incremental in extending known concepts.
The paper tackles the problem of anonymizing microdata for privacy-preserving data publishing by proposing Pk-anonymity, a new probabilistic extension of k-anonymity that requires no parametric assumptions, and applies it to the post randomization method (PRAM), proving that PRAM can satisfy both Pk-anonymity and ε-differential privacy.
The release of anonymized microdata is an important topic in statistical disclosure control (SDC) and privacy-preserving data publishing (PPDP), yet it has not been sufficiently solved. In these fields, k-anonymity has been widely studied as an anonymity notion, mainly for deterministic anonymization algorithms, and some probabilistic relaxations have been developed. However, these relaxations are limited: they are either weaker than the original k-anonymity or require strong parametric assumptions. First, we propose Pk-anonymity, a new probabilistic k-anonymity, and prove that it is a mathematical extension of k-anonymity rather than a relaxation. Furthermore, Pk-anonymity requires no parametric assumptions. This property is significant because it enables us to compare the privacy levels of probabilistic microdata release algorithms with those of deterministic ones. Second, we apply Pk-anonymity to the post randomization method (PRAM), an SDC algorithm based on randomization. PRAM is proven to satisfy Pk-anonymity in a controlled way, i.e., one can set PRAM's parameter so that Pk-anonymity is satisfied. PRAM is also known to satisfy ε-differential privacy, a recent, popular, and strong privacy notion. Our results therefore significantly enhance PRAM, since they imply that it satisfies both important notions: k-anonymity and ε-differential privacy.
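
To make the randomization step concrete, the following is a minimal sketch of PRAM applied to a single categorical attribute. It is an illustration under stated assumptions, not the paper's construction. The transition matrix shape (keep a value with probability `retain_prob`, otherwise move uniformly to another category) and the names `make_pram_matrix`, `pram`, and `epsilon_of` are hypothetical. The ε computed here is the usual per-record bound, the largest log-ratio within any column of the transition matrix. The Pk-anonymity condition itself is not given in the abstract, so it is not computed.

```python
import math
import random


def make_pram_matrix(categories, retain_prob):
    """Build an illustrative transition matrix (an assumption, not the paper's).

    Each value is kept with probability retain_prob and otherwise replaced
    by one of the other categories chosen uniformly at random.
    """
    off_diagonal = (1.0 - retain_prob) / (len(categories) - 1)
    return {
        a: {b: (retain_prob if a == b else off_diagonal) for b in categories}
        for a in categories
    }


def pram(values, matrix, rng):
    """Perturb each value independently according to its matrix row."""
    perturbed = []
    for value in values:
        row = matrix[value]
        targets = list(row)
        weights = [row[t] for t in targets]
        perturbed.append(rng.choices(targets, weights=weights, k=1)[0])
    return perturbed


def epsilon_of(matrix):
    """Smallest eps with P[a][c] <= exp(eps) * P[b][c] for all a, b, c."""
    categories = list(matrix)
    eps = 0.0
    for c in categories:
        column = [matrix[a][c] for a in categories]
        if min(column) == 0.0:
            return math.inf  # some output reveals the input with certainty
        eps = max(eps, math.log(max(column) / min(column)))
    return eps


if __name__ == "__main__":
    categories = ["A", "B", "C"]
    matrix = make_pram_matrix(categories, retain_prob=0.5)
    rng = random.Random(0)  # fixed seed only to make the run repeatable
    released = pram(["A", "A", "B", "C"], matrix, rng)
    print(released, epsilon_of(matrix))
```

Lowering `retain_prob` toward the uniform value 1/m (for m categories) shrinks ε toward 0 at the cost of data utility, which is the kind of parameter control the abstract refers to.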