Data set operations to hide decision tree rules
This work addresses privacy concerns for data owners who share binary datasets, though the contribution appears incremental, as it builds on existing hiding methodologies.
The paper tackles the problem of preserving the privacy of sensitive patterns when inducing decision trees. It adopts a record augmentation approach to hide sensitive classification rules in binary datasets, and it demonstrates the methodology with an example and an indicative experiment using a prototype hiding tool.
This paper focuses on preserving the privacy of sensitive patterns when inducing decision trees. We adopt a record augmentation approach for hiding sensitive classification rules in binary datasets. Such a hiding methodology is preferred over other heuristic solutions such as output perturbation or cryptographic techniques, which restrict the usability of the data, because it leaves the raw data itself readily available for public use. We present key lemmas related to the hiding process, and we demonstrate the methodology with an example and an indicative experiment using a prototype hiding tool.
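To make the idea of record augmentation concrete, the following is a minimal sketch and not the paper's algorithm or its prototype tool. It assumes that a sensitive rule depends on a split on one binary attribute, and that the rule can be hidden by adding records until both branches of that attribute have the same class distribution, so the information gain of the split falls to zero. The function names (`entropy`, `info_gain`, `augment_to_hide`), the toy dataset, and the choice to set the other attributes of the added records to 0 are all illustrative assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def info_gain(rows, attr):
    """Information gain of splitting on the binary attribute at index `attr`.

    Each row is a tuple of 0/1 attribute values followed by a 0/1 class label.
    """
    gain = entropy([r[-1] for r in rows])
    for value in (0, 1):
        branch = [r[-1] for r in rows if r[attr] == value]
        gain -= len(branch) / len(rows) * entropy(branch)
    return gain


def augment_to_hide(rows, attr):
    """Return the original rows plus added records (hypothetical heuristic).

    For each class label, the smaller branch of `attr` is padded until both
    branches contain the same number of records with that label. No original
    record is modified or removed.
    """
    counts = Counter((r[attr], r[-1]) for r in rows)
    n_attrs = len(rows[0]) - 1
    added = []
    for label in (0, 1):
        target = max(counts[(0, label)], counts[(1, label)])
        for value in (0, 1):
            for _ in range(target - counts[(value, label)]):
                record = [0] * n_attrs  # assumption: other attributes set to 0
                record[attr] = value
                added.append(tuple(record) + (label,))
    return rows + added


# Toy dataset: attribute 0 equal to 1 always implies class 1 (the sensitive rule).
original = [(1, 0, 1), (1, 1, 1), (1, 0, 1), (0, 1, 0), (0, 0, 0), (0, 1, 1)]
hidden = augment_to_hide(original, attr=0)

print("gain before:", info_gain(original, 0))
print("gain after: ", info_gain(hidden, 0))
print("records added:", len(hidden) - len(original))
```

In this sketch the raw records are released unchanged and only new records are appended, which is the property the abstract attributes to augmentation. A realistic procedure would also have to choose the remaining attribute values of the added records so that they do not introduce new informative splits, which this example does not attempt.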