CR · IT · LG · ME · Jun 21, 2022

Differentially Private Maximal Information Coefficients

arXiv:2206.10685v1 · 3 citations · h-index: 21 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses privacy concerns for statisticians and data analysts using MIC on sensitive datasets, though it is incremental as it builds on existing differential privacy methods.

The authors address the problem of computing the Maximal Information Coefficient (MIC) on sensitive data without leaking private information. They introduce MICr, a new approximation of MIC, together with two differentially private versions of it. These private MICr statistics outperform direct application of the Laplace mechanism and give usable accuracy on moderately large datasets.

The Maximal Information Coefficient (MIC) is a powerful statistic to identify dependencies between variables. However, it may be applied to sensitive data, and publishing it could leak private information. As a solution, we present algorithms to approximate MIC in a way that provides differential privacy. We show that the natural application of the classic Laplace mechanism yields insufficient accuracy. We therefore introduce the MICr statistic, which is a new MIC approximation that is more compatible with differential privacy. We prove MICr is a consistent estimator for MIC, and we provide two differentially private versions of it. We perform experiments on a variety of real and synthetic datasets. The results show that the private MICr statistics significantly outperform direct application of the Laplace mechanism. Moreover, experiments on real-world datasets show accuracy that is usable when the sample size is at least moderately large.
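As a rough illustration of the baseline the abstract refers to, the sketch below shows the generic Laplace mechanism applied to a bounded statistic. It is not the authors' implementation and it does not compute MIC or MICr. The function names (`laplace_noise`, `private_release`) are hypothetical, and the sensitivity passed in the usage lines is an arbitrary placeholder, not the actual sensitivity of MIC. Clipping to [0, 1] is included only because MIC takes values in that range; clipping is post-processing and does not weaken the privacy guarantee.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_release(statistic, sensitivity, epsilon, rng=None, clip=(0.0, 1.0)):
    """Release `statistic` with epsilon-differential privacy (Laplace mechanism).

    `sensitivity` must be an upper bound on how much the statistic can change
    when a single record changes; this sketch assumes the caller supplies it.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = rng if rng is not None else random.Random()
    noisy = statistic + laplace_noise(sensitivity / epsilon, rng)
    low, high = clip
    # Post-processing: keep the released value in the statistic's valid range.
    return min(max(noisy, low), high)


# Usage with placeholder numbers (0.01 is NOT the real sensitivity of MIC).
rng = random.Random(0)
released = private_release(statistic=0.5, sensitivity=0.01, epsilon=1.0, rng=rng)
```

The noise scale is `sensitivity / epsilon`, so a statistic with large sensitivity, or a small privacy budget, yields a noisy release. That trade-off is what motivates designing a statistic that is more compatible with differential privacy, as the abstract describes for MICr.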

Code Implementations: 1 repo
