Towards Improving Solution Dominance with Incomparability Conditions: A case-study using Generator Itemset Mining
This work addresses efficiency issues in data mining for researchers and practitioners, but it is incremental as it builds on existing dominance programming frameworks.
The paper tackles the challenge of improving efficiency in constraint-based pattern mining by introducing incomparability conditions alongside dominance relations, enabling a batch-wise search that avoids checking incomparable solutions. Preliminary experiments on generator itemset mining show that this approach reduces the need for post-processing to filter dominated solutions.
Finding interesting patterns is a challenging task in data mining. Constraint-based mining is a well-known approach to this task, and one for which constraint programming has been shown to be a well-suited and generic framework. Dominance programming has been proposed as an extension that can capture an even wider class of constraint-based mining problems by allowing patterns to be compared with one another. In this paper, in addition to specifying a dominance relation, we introduce the ability to specify an incomparability condition. Using these two concepts, we devise a generic framework that performs a batch-wise search and avoids checking incomparable solutions against each other. We extend the ESSENCE language and its underlying modelling pipeline to support this. We use the generator itemset mining problem as a test case and give a declarative specification for it. We also present preliminary experimental results on this problem class with a CP solver backend, which show that using the incomparability condition during search can improve the efficiency of dominance programming and reduce the need for post-processing to filter dominated solutions.
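To make the batch-wise idea concrete, the following is a minimal Python sketch of generator itemset mining. It is not the ESSENCE specification or the CP-based pipeline described above: it enumerates candidates by brute force, and it assumes that the incomparability condition is "two itemsets have the same size", which is safe here because neither of two distinct itemsets of equal size can be a proper subset of the other. The dominance relation used is the standard one for generators: an itemset is dominated if some proper subset has the same support. The names `cover` and `mine_generators` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def cover(itemset, transactions):
    """Return the indices of the transactions that contain every item of itemset."""
    return frozenset(i for i, t in enumerate(transactions) if itemset <= t)


def mine_generators(transactions, min_support):
    """Brute-force, batch-wise enumeration of frequent generator itemsets.

    Assumption: itemsets of the same size are treated as mutually
    incomparable, so each size forms one batch. Candidates are checked
    for dominance only against generators found in earlier (smaller)
    batches, never against members of their own batch.
    """
    items = sorted(set().union(*transactions))
    generators = {}  # maps each generator itemset to its support

    for size in range(len(items) + 1):
        batch = {}
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = len(cover(candidate, transactions))
            if support < min_support:
                continue
            # Dominated if a strictly smaller generator has the same support.
            dominated = any(
                g < candidate and s == support for g, s in generators.items()
            )
            if not dominated:
                batch[candidate] = support
        # Commit the whole batch at once; no post-processing pass is needed.
        generators.update(batch)

    return generators


if __name__ == "__main__":
    data = [frozenset("ab"), frozenset("abc"), frozenset("c")]
    for itemset, support in mine_generators(data, min_support=1).items():
        print(sorted(itemset), support)
```

Checking only against previously found generators is sufficient in this sketch: if a candidate has any proper subset with the same support, a minimal such subset is itself a generator and belongs to an earlier batch. Because each batch is committed only after all of its members have been tested, no solution is ever compared with another solution of the same size, and every itemset that is kept is already non-dominated.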