Differentially Private Learning of Undirected Graphical Models using Collective Graphical Models
This work addresses privacy-preserving machine learning for sensitive data such as human mobility, offering an incremental improvement over existing methods.
The paper tackles the problem of learning discrete, undirected graphical models with differential privacy. It shows that a naive approach using noisy sufficient statistics outperforms general-purpose differentially private learning algorithms but has limitations. It then proposes a more principled method, based on collective graphical models and expectation-maximization, that learns better models on both synthetic data and real human mobility data.
We investigate the problem of learning discrete, undirected graphical models in a differentially private way. We show that the approach of releasing noisy sufficient statistics using the Laplace mechanism achieves a good trade-off between privacy, utility, and practicality. A naive learning algorithm that uses the noisy sufficient statistics "as is" outperforms general-purpose differentially private learning algorithms. However, it has three limitations: it ignores knowledge about the data generating process, rests on uncertain theoretical foundations, and exhibits certain pathologies. We develop a more principled approach that applies the formalism of collective graphical models to perform inference over the true sufficient statistics within an expectation-maximization framework. We show that this learns better models than competing approaches on both synthetic data and real human mobility data used as a case study.
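To make the release step concrete, the following is a minimal sketch of publishing noisy sufficient statistics with the Laplace mechanism. It is an illustration under stated assumptions, not the authors' implementation. The sufficient statistics are taken to be the contingency tables of each clique. Neighboring datasets are assumed to differ by replacing one record, which would give an L1 sensitivity of twice the number of cliques. The function names (`clique_counts`, `laplace_noise`, `privatize`) and the toy data are hypothetical.

```python
import random
from collections import Counter
from itertools import product


def clique_counts(data, cliques, num_states):
    """Build one contingency table per clique (the sufficient statistics)."""
    tables = {}
    for clique in cliques:
        counts = Counter(tuple(row[v] for v in clique) for row in data)
        # Include zero-count configurations so every cell receives noise.
        tables[clique] = {
            cfg: counts.get(cfg, 0)
            for cfg in product(range(num_states), repeat=len(clique))
        }
    return tables


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(tables, epsilon, rng):
    """Add independent Laplace noise to every cell of every table."""
    # Assumption: replacing one record changes at most two cells per
    # clique table by one each, so the L1 sensitivity is 2 * (#cliques).
    sensitivity = 2.0 * len(tables)
    scale = sensitivity / epsilon
    return {
        clique: {cfg: n + laplace_noise(scale, rng) for cfg, n in table.items()}
        for clique, table in tables.items()
    }


# Toy example: three binary variables on a chain 0 - 1 - 2.
rng = random.Random(0)
data = [(0, 1, 1), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
cliques = [(0, 1), (1, 2)]
true_tables = clique_counts(data, cliques, num_states=2)
noisy_tables = privatize(true_tables, epsilon=1.0, rng=rng)
```

A naive learner would plug `noisy_tables` directly into a standard estimation routine in place of `true_tables`. Note that the noisy cells are real-valued and can be negative, so they need not correspond to any actual dataset.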