CR · Jun 11, 2016

Differentially Private Random Decision Forests using Smooth Sensitivity

arXiv:1606.03572v4 · 5 citations
AI Analysis

This paper addresses privacy-preserving machine learning for sensitive data, offering an incremental improvement over existing methods.

The paper tackles the problem of building differentially private decision forests by proposing a method that reduces query sensitivity using smooth sensitivity and the Exponential Mechanism, resulting in substantially higher predictive power than the state-of-the-art.

We propose a new differentially-private decision forest algorithm that minimizes both the number of queries required, and the sensitivity of those queries. To do so, we build an ensemble of random decision trees that avoids querying the private data except to find the majority class label in the leaf nodes. Rather than using a count query to return the class counts like the current state-of-the-art, we use the Exponential Mechanism to only output the class label itself. This drastically reduces the sensitivity of the query -- often by several orders of magnitude -- which in turn reduces the amount of noise that must be added to preserve privacy. Our improved sensitivity is achieved by using "smooth sensitivity", which takes into account the specific data used in the query rather than assuming the worst-case scenario. We also extend work done on the optimal depth of random decision trees to handle continuous features, not just discrete features. This, along with several other improvements, allows us to create a differentially private decision forest with substantially higher predictive power than the current state-of-the-art.
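
To make the leaf-labelling step concrete, here is a minimal sketch of the textbook Exponential Mechanism applied to choosing a leaf's class label. It is not the authors' implementation: the function name `exponential_mechanism_label`, its parameters, and the use of raw class counts as the utility with a fixed `sensitivity` of 1 are assumptions made for illustration. The paper replaces this worst-case sensitivity with a data-dependent smooth sensitivity, which is not reproduced here.

```python
import math
import random
from collections import Counter


def exponential_mechanism_label(labels, classes, epsilon, sensitivity=1.0, rng=None):
    """Return one class label, sampled with probability proportional to
    exp(epsilon * count(label) / (2 * sensitivity)).

    Assumption: utility = class count in the leaf, so adding or removing one
    record changes any utility by at most 1 (hence sensitivity defaults to 1).
    """
    rng = rng or random.Random()
    counts = Counter(labels)

    # Exponent for every candidate class, including classes absent from the leaf.
    scores = [epsilon * counts.get(c, 0) / (2.0 * sensitivity) for c in classes]

    # Subtract the maximum before exponentiating to avoid overflow;
    # this rescales all weights equally and leaves the distribution unchanged.
    max_score = max(scores)
    weights = [math.exp(s - max_score) for s in scores]

    # Only the sampled label is released, never the counts themselves.
    return rng.choices(classes, weights=weights, k=1)[0]


if __name__ == "__main__":
    leaf_labels = ["yes"] * 50 + ["no"] * 5
    print(exponential_mechanism_label(leaf_labels, ["yes", "no"], epsilon=1.0))
```

Because the output is a single label rather than a vector of noisy counts, a clear majority in the leaf is returned with high probability even at modest values of `epsilon`, while an empty leaf yields a uniformly random label.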
