Privacy Preserving Techniques Applied to CPNI Data: Analysis and Recommendations
This is an incremental landscape paper that reviews existing anonymization methods for CPNI data. Such data is valuable to policymakers and researchers, but privacy concerns limit how it can be used.
The paper addresses privacy concerns in using Consumer Proprietary Network Information (CPNI) data, such as Call Detail Records, by analyzing state-of-the-art anonymization techniques and their shortcomings. It notes that traditional methods like pseudonymization are insufficient, as demonstrated by re-identification results on mobile phone datasets.
With mobile phone penetration rates reaching 90%, Consumer Proprietary Network Information (CPNI) can offer extremely valuable information to different sectors, including policymakers. Indeed, Call Detail Records, which are part of CPNI, have been successfully used to provide real-time traffic information, to improve our understanding of the dynamics of people's mobility and thereby support the prevention and control of infectious diseases, and to offer population statistics. While the usefulness of CPNI data is not in doubt, privacy concerns about sharing individuals' data have prevented it from being used to its full potential. Traditional anonymization measures, such as pseudonymization and standard de-identification, have been shown to be insufficient to protect privacy, and this has been shown specifically on mobile phone datasets. For example, researchers have shown that only four data points of approximate place and time information were enough to re-identify 95% of users in a dataset of 1.5 million mobile phone users. In this landscape paper, we discuss the state-of-the-art anonymization techniques and their shortcomings.
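To make the re-identification risk concrete, the following minimal sketch estimates how often a handful of known place-and-time points single out one user in a pseudonymized dataset. It is an illustration under stated assumptions, not the procedure of the study cited above: the function name `unicity`, the representation of a trace as a set of `(cell, hour)` pairs, and the toy data are all hypothetical.

```python
import random


def unicity(traces, num_points, seed=0):
    """Estimate the fraction of users singled out by `num_points` known points.

    `traces` maps a pseudonymous user id to a set of (cell, hour) pairs.
    This layout is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    eligible = 0
    unique = 0
    for points in traces.values():
        if len(points) < num_points:
            continue  # too few points to draw a sample from this user
        eligible += 1
        # Points an adversary is assumed to know about this user.
        known = set(rng.sample(sorted(points), num_points))
        # Count every trace in the dataset that contains all known points.
        matches = sum(1 for other in traces.values() if known <= other)
        if matches == 1:  # only the user's own trace matches
            unique += 1
    return unique / eligible if eligible else 0.0


# Toy data: pseudonyms replace identities, but the traces are left intact.
toy_traces = {
    "u1": {("A", 8), ("B", 9), ("C", 18)},
    "u2": {("A", 8), ("B", 9), ("D", 18)},
    "u3": {("A", 8), ("E", 12), ("F", 20)},
}
share_unique = unicity(toy_traces, num_points=3)
```

In this toy dataset every user shares the point `("A", 8)`, yet each full three-point trace is distinct, so knowing three points singles out every user even though no name or phone number is present. This is the sense in which pseudonymization alone leaves users re-identifiable.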