Daniele Panfilo

AI
3 papers
21 citations
Novelty 27%
AI Score 33

3 Papers

LG Nov 30, 2022
Generating Realistic Synthetic Relational Data through Graph Variational Autoencoders

Ciro Antonio Mami, Andrea Coser, Eric Medvet et al.

Synthetic data generation has recently gained widespread attention as a more reliable alternative to traditional data anonymization. The methods involved were originally developed for image synthesis. Hence, their application to the typically tabular and relational datasets from healthcare, finance and other industries is non-trivial. While substantial research has been devoted to the generation of realistic tabular datasets, the study of synthetic relational databases is still in its infancy. In this paper, we combine the variational autoencoder framework with graph neural networks to generate realistic synthetic relational databases. We then apply the obtained method to two publicly available databases in computational experiments. The results indicate that real databases' structures are accurately preserved in the resulting synthetic datasets, even for large datasets with advanced data types.

29.0 CR Apr 2
Empirical Evaluation of Structured Synthetic Data Privacy Metrics: Novel experimental framework

Milton Nicolás Plasencia Palacios, Alexander Boudewijn, Sebastiano Saccani et al.

Synthetic data generation is gaining traction as a privacy enhancing technology (PET). When properly generated, synthetic data preserve the analytic utility of real data while avoiding the retention of information that would allow the identification of specific individuals. However, the concept of data privacy remains elusive, making it challenging for practitioners to evaluate and benchmark the degree of privacy protection offered by synthetic data. In this paper, we propose a framework to empirically assess the efficacy of tabular synthetic data privacy quantification methods through controlled, deliberate risk insertion. To demonstrate this framework, we survey existing approaches to synthetic data privacy quantification and the related legal theory. We then apply the framework to the main privacy quantification methods with no-box threat models on publicly available datasets.

AI Nov 29, 2023
Privacy Measurement in Tabular Synthetic Data: State of the Art and Future Research Directions

Alexander Boudewijn, Andrea Filippo Ferraris, Daniele Panfilo et al.

Synthetic data (SD) have garnered attention as a privacy enhancing technology. Unfortunately, there is no standard for quantifying their degree of privacy protection. In this paper, we discuss proposed quantification approaches. This contributes to the development of SD privacy standards; stimulates multi-disciplinary discussion; and helps SD researchers make informed modeling and evaluation decisions.