IR · DB · Jan 23, 2019

Boosting Frequent Itemset Mining via Early Stopping Intersections

arXiv:1901.07773v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work provides an incremental improvement for data mining practitioners by speeding up existing frequent itemset mining algorithms.

The paper reduces support-checking time in frequent itemset mining algorithms with an early-stopping technique that detects infrequent candidates early. The authors report reduced runtime across a variety of datasets.

Mining frequent itemsets from a transaction database has emerged as a fundamental problem in data mining and has established itself as a building block for many pattern mining tasks. In this paper, we present a general technique to reduce support-checking time in existing depth-first search generate-and-test schemes such as Eclat/dEclat and PrePost+. The technique allows infrequent candidate itemsets to be detected early. It is based on an early-stopping criterion and is general enough to apply to many frequent itemset mining algorithms. We have applied it to two TID-list based schemes (Eclat/dEclat) and one N-list based scheme (PrePost+). Tests over a variety of datasets confirm its effectiveness in reducing runtime.
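
To make the idea concrete, here is a minimal Python sketch of one plausible early-stopping criterion for a TID-list intersection of the kind Eclat performs. The abstract does not spell out the paper's exact criterion, so this is an illustration under stated assumptions and not the authors' implementation: the function name `intersect_with_early_stop`, the sorted-list representation, and the specific upper bound used are all hypothetical.

```python
from typing import List, Optional


def intersect_with_early_stop(
    a: List[int], b: List[int], minsup: int
) -> Optional[List[int]]:
    """Intersect two sorted TID lists (hypothetical helper).

    Returns the intersection if its size reaches minsup, otherwise None.
    The scan aborts as soon as the candidate can no longer be frequent.
    """
    i, j = 0, 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        # Upper bound on the final support: matches found so far plus the
        # length of the shorter unscanned tail. Assumed criterion, not
        # necessarily the one used in the paper.
        if len(out) + min(len(a) - i, len(b) - j) < minsup:
            return None  # infrequent candidate detected early
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out if len(out) >= minsup else None


tids_x = [1, 2, 3, 5, 8]  # transactions containing itemset X
tids_y = [2, 3, 4, 8, 9]  # transactions containing itemset Y
frequent = intersect_with_early_stop(tids_x, tids_y, minsup=3)
pruned = intersect_with_early_stop(tids_x, tids_y, minsup=4)
```

With `minsup=3` the scan runs to completion and returns the shared TIDs. With `minsup=4` the bound drops below the threshold partway through, so the function returns `None` without scanning the remaining entries.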

