CR | Jul 24, 2019

Privacy Parameter Variation Using RAPPOR on a Malware Dataset

Peter Aaby, Juanjo Mata De Acuna, Richard Macfarlane, William J Buchanan

arXiv:1907.10387v1 | 6.82 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses privacy-utility trade-offs for data-driven companies handling sensitive user data, but it is incremental as it applies an existing method to new data.

The paper investigates how varying privacy parameters in the RAPPOR method affects utility on a malware dataset, finding that higher privacy settings reduce accuracy but remain usable for analysis.

Stricter data protection regulations and the poor application of privacy protection techniques have resulted in a requirement for data-driven companies to adopt new methods of analysing sensitive user data. The RAPPOR (Randomized Aggregatable Privacy-Preserving Ordinal Response) method adds parameterised noise, which must be carefully selected to maintain adequate privacy without losing analytical value. This paper applies RAPPOR privacy parameter variations against a public dataset containing a list of running Android applications data. The dataset is filtered and sampled into small (10,000); medium (100,000); and large (1,200,000) sample sizes while applying RAPPOR with ε = 10; 1.0; and 0.1 (respectively low; medium; high privacy guarantees). Also, in order to observe detailed variations within high to medium privacy guarantees (ε = 0.5 to 1.0), a second experiment is conducted by progressively.
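To make the role of ε concrete, the following is a minimal Python sketch of the noise step only, not the paper's experimental setup. It assumes RAPPOR's permanent randomized response with the bound ε = 2h·ln((1 − f/2)/(f/2)) from the original RAPPOR paper, a single hash function (h = 1), and a single reported bit per user instead of a full Bloom filter. The function names, the 30% true rate, and the mapping from ε to f are illustrative assumptions; the abstract does not state how the authors configured RAPPOR's internal parameters.

```python
import math
import random


def f_from_epsilon(epsilon, num_hashes=1):
    """Solve eps = 2*h*ln((1 - f/2) / (f/2)) for the noise parameter f.

    Assumption: num_hashes=1 is chosen for illustration only.
    """
    return 2.0 / (1.0 + math.exp(epsilon / (2.0 * num_hashes)))


def permanent_randomized_response(bit, f, rng):
    """Report 1 with prob. f/2, 0 with prob. f/2, the true bit otherwise."""
    u = rng.random()
    if u < f / 2:
        return 1
    if u < f:
        return 0
    return bit


def estimate_true_rate(reports, f):
    """Debias the observed rate: E[observed] = (1 - f) * true + f / 2."""
    observed = sum(reports) / len(reports)
    return (observed - f / 2) / (1 - f)


def simulate(epsilon, n, true_rate, seed=0):
    """Randomize n single-bit reports and return the debiased estimate."""
    rng = random.Random(seed)
    f = f_from_epsilon(epsilon)
    ones = int(n * true_rate)
    true_bits = [1] * ones + [0] * (n - ones)
    reports = [permanent_randomized_response(b, f, rng) for b in true_bits]
    return estimate_true_rate(reports, f)


if __name__ == "__main__":
    # Sample sizes and epsilon values taken from the abstract;
    # the true rate of 0.3 is a made-up illustrative value.
    for n in (10_000, 100_000, 1_200_000):
        for eps in (10, 1.0, 0.1):
            est = simulate(eps, n, true_rate=0.3)
            print(f"n={n:>9,} eps={eps:<4} f={f_from_epsilon(eps):.3f} estimate={est:.3f}")
```

Under these assumptions, a smaller ε pushes f towards 1, so almost every report is random and the debiased estimate needs a much larger population to stabilise, which is the privacy-utility trade-off the abstract describes.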
