dpmm: Differentially Private Marginal Models, a Library for Synthetic Tabular Data Generation
This work gives researchers and practitioners a practical tool for generating privacy-preserving synthetic data, though its contribution is incremental, since it packages existing methods into a library.
The authors address the problem of generating synthetic tabular data with differential privacy guarantees by developing dpmm, an open-source library whose three marginal models (PrivBayes, MST, and AIM) achieve superior utility and offer richer functionality than alternative implementations.
We propose dpmm, an open-source library for synthetic data generation with Differentially Private (DP) guarantees. It includes three popular marginal models -- PrivBayes, MST, and AIM -- that achieve superior utility and offer richer functionality compared to alternative implementations. Additionally, we adopt best practices to provide end-to-end DP guarantees and address well-known DP-related vulnerabilities. Our goal is to accommodate a wide audience with easy-to-install, highly customizable, and robust model implementations. Our codebase is available from https://github.com/sassoftware/dpmm.
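As a rough illustration of the idea behind marginal models in general, the sketch below measures a single two-way marginal with Gaussian noise, post-processes it into a distribution, and samples synthetic rows from it. It is not dpmm's API and does not reproduce PrivBayes, MST, or AIM. All function names, the toy columns, and the `sigma` value are hypothetical; calibrating `sigma` to a target privacy budget and selecting which marginals to measure are omitted.

```python
import random
from collections import Counter
from itertools import product


def noisy_marginal(records, columns, domains, sigma, rng):
    """Count every cell of the marginal over `columns` and add Gaussian noise.

    Under add/remove-one-record neighbouring datasets, a marginal has L2
    sensitivity 1; `sigma` (hypothetical here) would be calibrated to the
    privacy budget in a real system.
    """
    counts = Counter(tuple(r[c] for c in columns) for r in records)
    # Enumerate the full domain so that empty cells also receive noise.
    cells = product(*(domains[c] for c in columns))
    return {cell: counts.get(cell, 0) + rng.gauss(0.0, sigma) for cell in cells}


def to_distribution(noisy):
    """Post-process noisy counts: clip negatives to zero, then normalize."""
    clipped = {cell: max(value, 0.0) for cell, value in noisy.items()}
    total = sum(clipped.values())
    if total == 0:
        # Fall back to a uniform distribution if every cell was clipped.
        return {cell: 1.0 / len(clipped) for cell in clipped}
    return {cell: value / total for cell, value in clipped.items()}


def sample_synthetic(dist, n, rng):
    """Draw n synthetic rows (as cell tuples) from the noisy distribution."""
    cells = list(dist)
    weights = [dist[cell] for cell in cells]
    return rng.choices(cells, weights=weights, k=n)


# Toy data: two categorical columns (assumed for illustration only).
# The standard-library generator is used for brevity and is not suitable
# for real privacy guarantees.
rng = random.Random(0)
domains = {"age_band": ["young", "old"], "smoker": ["no", "yes"]}
records = (
    [{"age_band": "young", "smoker": "no"}] * 60
    + [{"age_band": "old", "smoker": "yes"}] * 40
)

noisy = noisy_marginal(records, ("age_band", "smoker"), domains, sigma=2.0, rng=rng)
dist = to_distribution(noisy)
synthetic = sample_synthetic(dist, 100, rng)
```

Because clipping, normalizing, and sampling only touch the noisy counts, they are post-processing and consume no additional privacy budget. The three models named above go further by choosing which marginals to measure and combining several of them.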