CR, DB | Feb 28, 2018

A Frequent Itemset Hiding Toolbox

arXiv:1802.10543v1 | 3 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses privacy concerns for companies and organizations sharing transactional data, but it is incremental as it focuses on tool implementation rather than new algorithmic breakthroughs.

The paper addresses the problem of protecting sensitive frequent itemsets in transactional databases when data is shared. It presents a toolbox that implements several hiding algorithms, and experiments on real-world datasets demonstrate the toolbox's efficiency and its convenience for comparing algorithms.

Advances in data collection and storage technologies have led companies and organizations to establish transactional databases, which allow enormous amounts of data to be stored efficiently. Useful knowledge can be mined from these data and used in several ways, depending on the nature of the data. Quite often, companies and organizations are willing to share data for mutual benefit. However, sharing such data carries risks, as privacy problems may arise. Sensitive data, along with sensitive knowledge inferred from it, must be protected from unintentional exposure to unauthorized parties. One form of inferred knowledge is frequent patterns, mined as frequent itemsets from transactional databases. The problem of protecting such patterns is known as the frequent itemset hiding problem. In this paper we present a toolbox that provides several implementations of frequent itemset hiding algorithms. First, we summarize the most important aspects of each algorithm. We then introduce the architecture of the toolbox and its novel features. Finally, we provide experimental results on real-world datasets, demonstrating the efficiency of the toolbox and the convenience it offers in comparing different algorithms.
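To make the frequent itemset hiding problem concrete, here is a minimal Python sketch. It is not one of the algorithms implemented in the toolbox, and it does not reflect the toolbox's API. It assumes an absolute support threshold and uses a deliberately naive heuristic: delete one item of the sensitive itemset from supporting transactions until the itemset's support drops below the threshold. All names (`support`, `frequent_itemsets`, `hide`) and the toy database are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def support(db, itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def frequent_itemsets(db, min_sup):
    """Brute-force enumeration of all itemsets with support >= min_sup.

    Exponential in the number of items; only suitable for toy examples.
    """
    items = sorted(set().union(*db))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            s = support(db, candidate)
            if s >= min_sup:
                result[candidate] = s
    return result


def hide(db, sensitive, min_sup):
    """Naive sanitization (illustrative assumption, not a published algorithm).

    Removes one fixed item of `sensitive` from supporting transactions,
    one transaction at a time, until support(sensitive) < min_sup.
    Returns a modified copy; the input database is left untouched.
    """
    sanitized = [set(t) for t in db]
    victim = min(sensitive)  # deterministic but arbitrary choice of item
    for t in sanitized:
        if support(sanitized, sensitive) < min_sup:
            break
        if sensitive <= t:
            t.discard(victim)
    return sanitized


# Toy database: each transaction is a set of items.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"b", "c"}, {"a", "c"}]
sensitive = frozenset({"a", "b"})
min_sup = 3

sanitized = hide(db, sensitive, min_sup)
lost = set(frequent_itemsets(db, min_sup)) - set(frequent_itemsets(sanitized, min_sup))
print("support before:", support(db, sensitive))
print("support after: ", support(sanitized, sensitive))
print("itemsets no longer frequent:", lost)
```

Comparing the frequent itemsets before and after sanitization shows which patterns were lost. Hiding algorithms in the literature differ mainly in how they choose the transactions and items to modify so that as few non-sensitive itemsets as possible are lost as a side effect, which is the kind of comparison a toolbox like the one described here is meant to support.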
