On the Challenges of Deploying Privacy-Preserving Synthetic Data in the Enterprise
This work addresses privacy concerns for enterprises using synthetic data, but it is incremental: it systematizes known challenges rather than introducing new methods.
The paper tackles the challenges of deploying privacy-preserving synthetic data in enterprises. It identifies over 40 challenges, organizes them into five groups (generation, infrastructure & architecture, governance, compliance & regulation, and adoption), and proposes a systematic approach for addressing them and building trust in the resulting solutions.
Generative AI technologies are gaining unprecedented popularity, and their remarkable capabilities are prompting a mix of excitement and apprehension. In this paper, we study the challenges associated with deploying synthetic data, a subfield of Generative AI. We focus on enterprise deployment, with an emphasis on the privacy concerns raised by the vast amounts of personal and highly sensitive data involved. We identify 40+ challenges and systematize them into five main groups: i) generation, ii) infrastructure & architecture, iii) governance, iv) compliance & regulation, and v) adoption. Additionally, we discuss a strategic and systematic approach that enterprises can employ to address these challenges and achieve their goals by establishing trust in the implemented solutions.