A global constraint for closed itemset mining
This work addresses a bottleneck in data mining for researchers and practitioners by improving the scalability of constraint programming approaches for pattern mining.
The paper tackled the challenge of efficiently mining closed frequent patterns in high-dimensional datasets by proposing a global constraint that avoids reified constraints and extra variables, resulting in a polynomial pruning algorithm that ensures domain consistency.
Discovering the set of closed frequent patterns is one of the fundamental problems in Data Mining. Recent Constraint Programming (CP) approaches for declarative itemset mining have proven their usefulness and flexibility. But the wide use of reified constraints in current CP approaches raises many difficulties to cope with high dimensional datasets. This paper proposes CLOSED PATTERN global constraint which does not require any reified constraints nor any extra variables to encode efficiently the Closed Frequent Pattern Mining (CFPM) constraint. CLOSED-PATTERN captures the particular semantics of the CFPM problem in order to ensure a polynomial pruning algorithm ensuring domain consistency. The computational properties of our constraint are analyzed and their practical effectiveness is experimentally evaluated.