AIDBLGJul 26, 2012

On When and How to use SAT to Mine Frequent Itemsets

arXiv:1207.6253v113 citations
Originality Incremental advance
AI Analysis

This work addresses the challenge of efficiently mining frequent itemsets for data mining practitioners, but it is incremental as it builds on existing constraint programming approaches.

The paper tackled the problem of frequent itemset mining by exploring when and how to use SAT encodings, showing that while SAT is often non-competitive with constraint programming, it performs best in specific scenarios.

A new stream of research was born in the last decade with the goal of mining itemsets of interest using Constraint Programming (CP). This has promoted a natural way to combine complex constraints in a highly flexible manner. Although CP state-of-the-art solutions formulate the task using Boolean variables, the few attempts to adopt propositional Satisfiability (SAT) provided an unsatisfactory performance. This work deepens the study on when and how to use SAT for the frequent itemset mining (FIM) problem by defining different encodings with multiple task-driven enumeration options and search strategies. Although for the majority of the scenarios SAT-based solutions appear to be non-competitive with CP peers, results show a variety of interesting cases where SAT encodings are the best option.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes