IT LG ML · Oct 17, 2019

Obfuscation via Information Density Estimation

Hsiang Hsu, Shahab Asoodeh, Flavio du Pin Calmon

arXiv:1910.08109v1 · 8.613 citations

Originality: Incremental advance

AI Analysis

This work addresses privacy protection in data analysis by providing a data-driven pipeline for obfuscating features that leak information about sensitive attributes. The contribution appears incremental: it builds on existing obfuscation methods and adds a novel estimator.

The paper tackles the problem of identifying features that leak sensitive information. It proposes a framework that uses information density estimation to detect such features, then applies a targeted obfuscation mechanism whose leakage guarantee is stated in terms of $\mathsf{E}_\gamma$-divergence. The approach is implemented on three real-world datasets.
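For reference, the two quantities named above have standard textbook definitions, sketched here under the assumption that the paper follows the usual conventions (its exact notation and its trimmed variant may differ). For a sensitive attribute $S$ and a feature $X$ with joint distribution $P_{S,X}$, the information density is

$$
i(s;x) = \log \frac{P_{S,X}(s,x)}{P_S(s)\,P_X(x)},
$$

and for $\gamma \ge 1$ the $\mathsf{E}_\gamma$-divergence between distributions $P$ and $Q$ is

$$
\mathsf{E}_\gamma(P \,\|\, Q) = \sup_{A} \big[ P(A) - \gamma\, Q(A) \big].
$$

Setting $\gamma = 1$ recovers the total variation distance, so $\mathsf{E}_\gamma$ can be read as a relaxed version of it.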

Identifying features that leak information about sensitive attributes is a key challenge in the design of information obfuscation mechanisms. In this paper, we propose a framework to identify information-leaking features via information density estimation. Here, features whose information densities exceed a pre-defined threshold are deemed information-leaking features. Once these features are identified, we sequentially pass them through a targeted obfuscation mechanism with a provable leakage guarantee in terms of $\mathsf{E}_\gamma$-divergence. The core of this mechanism relies on a data-driven estimate of the trimmed information density for which we propose a novel estimator, named the trimmed information density estimator (TIDE). We then use TIDE to implement our mechanism on three real-world datasets. Our approach can be used as a data-driven pipeline for designing obfuscation mechanisms targeting specific features.
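The thresholding step described in the abstract can be illustrated with a minimal sketch. It assumes discrete data and uses a simple plug-in (empirical frequency) estimate of the information density; this is not the paper's TIDE estimator, and the function names `empirical_information_density` and `leaking_pairs` are hypothetical, introduced only for illustration.

```python
import math
from collections import Counter


def empirical_information_density(samples):
    """Plug-in estimate of i(s; x) = log P(s, x) / (P(s) P(x)).

    `samples` is a list of (s, x) pairs of hashable values. This is an
    illustrative frequency-count estimate for discrete data, not TIDE.
    """
    n = len(samples)
    joint = Counter(samples)
    count_s = Counter(s for s, _ in samples)
    count_x = Counter(x for _, x in samples)
    # (c/n) / ((count_s/n) * (count_x/n)) simplifies to c*n / (count_s*count_x).
    return {
        (s, x): math.log(c * n / (count_s[s] * count_x[x]))
        for (s, x), c in joint.items()
    }


def leaking_pairs(samples, threshold):
    """Return the (s, x) pairs whose estimated density exceeds `threshold`."""
    density = empirical_information_density(samples)
    return {pair for pair, value in density.items() if value > threshold}


if __name__ == "__main__":
    # Feature value fully determines the sensitive attribute: high density.
    dependent = [(0, "a")] * 4 + [(1, "b")] * 4
    # Feature value is independent of the sensitive attribute: zero density.
    independent = [(0, "a"), (0, "b"), (1, "a"), (1, "b")]
    print(leaking_pairs(dependent, threshold=0.5))
    print(leaking_pairs(independent, threshold=0.5))
```

In this toy setup the dependent sample flags both observed pairs, while the independent sample flags none. A plug-in estimate like this becomes unreliable for continuous or high-dimensional features, which is the setting where a dedicated estimator would be needed.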
