CR AI · Feb 5

FHAIM: Fully Homomorphic AIM For Private Synthetic Data Generation

Mayank Kumar, Qian Lou, Paulo Barreto, Martine De Cock, Sikha Pentyala

arXiv:2602.05838v210.33 citations · h-index: 9

Originality: Incremental advance

AI Analysis

This work lets data holders in sensitive domains such as healthcare, education, and finance use synthetic data generation without trusting third-party providers with their private data. It is an incremental step toward privacy-preserving AI.

This paper addresses the challenge of generating synthetic data from private tabular datasets without exposing the raw data. The authors develop FHAIM, a fully homomorphic encryption (FHE) framework that adapts the AIM algorithm to train a marginal-based synthetic data generator on encrypted data, so that the data stays encrypted throughout training.

Data is the lifeblood of AI, yet much of the most valuable data remains locked in silos due to privacy concerns and regulations. As a result, AI remains heavily underutilized in many of the most important domains, including healthcare, education, and finance. Synthetic data generation (SDG), i.e., the generation of artificial data with a synthesizer trained on real data, offers an appealing way to make data available while mitigating privacy concerns. However, existing SDG-as-a-service workflows require data holders to trust providers with access to private data. We propose FHAIM, the first fully homomorphic encryption (FHE) framework for training a marginal-based synthetic data generator on encrypted tabular data. FHAIM adapts the widely used AIM algorithm to the FHE setting using novel FHE protocols, ensuring that the private data remains encrypted throughout and is released only with differential privacy guarantees. Our empirical analysis shows that FHAIM preserves the performance of AIM while maintaining feasible runtimes.
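To make "marginal-based" concrete, the following is a minimal plaintext sketch of the kind of measurement such synthesizers build on: count how often each combination of attribute values occurs, then perturb the counts with Gaussian noise before release. It is not FHAIM's protocol and involves no encryption. The function names, the toy columns, and the noise scale `sigma` are illustrative assumptions, and calibrating `sigma` to a differential privacy budget is left out.

```python
import random
from collections import Counter
from itertools import product


def marginal(rows, cols, domain):
    """Count how often each combination of values for `cols` occurs in `rows`."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    # Enumerate the full domain so that empty cells appear with a count of 0.
    cells = product(*(domain[c] for c in cols))
    return {cell: counts.get(cell, 0) for cell in cells}


def noisy_marginal(rows, cols, domain, sigma, rng):
    """Add independent Gaussian noise (std. dev. `sigma`) to every cell."""
    exact = marginal(rows, cols, domain)
    return {cell: n + rng.gauss(0.0, sigma) for cell, n in exact.items()}


# Hypothetical toy table with two categorical attributes.
domain = {"age": ["young", "old"], "smoker": ["no", "yes"]}
rows = [
    {"age": "young", "smoker": "no"},
    {"age": "young", "smoker": "no"},
    {"age": "young", "smoker": "yes"},
    {"age": "old", "smoker": "no"},
    {"age": "old", "smoker": "yes"},
]

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
exact = marginal(rows, ("age", "smoker"), domain)
noisy = noisy_marginal(rows, ("age", "smoker"), domain, sigma=1.0, rng=rng)
```

In the setting the abstract describes, the rows would stay encrypted and only differentially private outputs would be released; this sketch shows just the plaintext analogue of a single noisy measurement.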
