Michael Stiber

2papers

2 Papers

CYFeb 16, 2022
Trusted Data Forever: Is AI the Answer?

Emanuele Frontoni, Marina Paolanti, Tracey P. Lauriault et al.

Archival institutions and programs worldwide work to ensure that the records of governments, organizations, communities, and individuals are preserved for future generations as cultural heritage, as sources of rights, and as vehicles for holding the past accountable and to inform the future. This commitment is guaranteed through the adoption of strategic and technical measures for the long-term preservation of digital assets in any medium and form - textual, visual, or aural. Public and private archives are the largest providers of data big and small in the world and collectively host yottabytes of trusted data, to be preserved forever. Several aspects of retention and preservation, arrangement and description, management and administrations, and access and use are still open to improvement. In particular, recent advances in Artificial Intelligence (AI) open the discussion as to whether AI can support the ongoing availability and accessibility of trustworthy public records. This paper presents preliminary results of the InterPARES Trust AI (I Trust AI) international research partnership, which aims to (1) identify and develop specific AI technologies to address critical records and archives challenges; (2) determine the benefits and risks of employing AI technologies on records and archives; (3) ensure that archival concepts and principles inform the development of responsible AI; and (4) validate outcomes through a conglomerate of case studies and demonstrations.

CRJan 25, 2021
Privacy Preserving Techniques Applied to CPNI Data: Analysis and Recommendations

Jeffrey Murray, Afra Mashhadi, Brent Lagesse et al.

With mobile phone penetration rates reaching 90%, Consumer Proprietary Network Information (CPNI) can offer extremely valuable information to different sectors, including policymakers. Indeed, as part of CPNI, Call Detail Records have been successfully used to provide real-time traffic information, to improve our understanding of the dynamics of people's mobility and so to allow prevention and measures in fighting infectious diseases, and to offer population statistics. While there is no doubt of the usefulness of CPNI data, privacy concerns regarding sharing individuals' data have prevented it from being used to its full potential. Traditional de-anonymization measures, such as pseudonymization and standard de-identification, have been shown to be insufficient to protect privacy. This has been specifically shown on mobile phone datasets. As an example, researchers have shown that with only four data points of approximate place and time information of a user, 95% of users could be re-identified in a dataset of 1.5 million mobile phone users. In this landscape paper, we will discuss the state-of-the-art anonymization techniques and their shortcomings.