A Global Constraint for Mining Sequential Patterns with GAP Constraint
This work addresses sequential pattern mining under gap constraints, a data mining problem, and improves on earlier constraint programming methods.
The paper tackles sequential pattern mining with gap constraints by proposing a global constraint called GAP-SEQ, which outperforms existing CP approaches and the state-of-the-art cSpade method on large datasets.
Sequential pattern mining (SPM) under gap constraint is a challenging task. Many efficient specialized methods have been developed, but they all suffer from a lack of genericity. Constraint Programming (CP) approaches are generic but not very effective because of the size of their encodings. In [7], we proposed the global constraint Prefix-Projection for SPM, which remedies this drawback. However, this global constraint cannot be directly extended to support the gap constraint. In this paper, we propose the global constraint GAP-SEQ, which handles SPM with or without gap constraint. GAP-SEQ relies on the principle of right pattern extensions. Experiments show that our approach clearly outperforms both CP approaches and the state-of-the-art cSpade method on large datasets.
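To make the gap constraint concrete, the following is a minimal brute-force sketch in Python; it is not the GAP-SEQ filtering algorithm. It assumes sequences of single items and reads gap[M, N] as "between two consecutive matched items, at least M and at most N items of the sequence are skipped". The function names (`end_positions`, `support`, `right_extensions`) and the toy database are hypothetical, and `right_extensions` is only one plausible reading of the phrase "right pattern extensions": the items found inside the gap window just after an occurrence of the pattern.

```python
from typing import List, Sequence, Set


def _window(end: int, length: int, min_gap: int, max_gap: int) -> range:
    """Positions reachable after `end` when skipping min_gap..max_gap items."""
    first = end + 1 + min_gap
    last = min(end + 1 + max_gap, length - 1)
    return range(first, last + 1)  # empty when first > last


def end_positions(pattern: Sequence[str], sequence: Sequence[str],
                  min_gap: int, max_gap: int) -> Set[int]:
    """Positions where some gap-respecting occurrence of `pattern` ends."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    # Occurrences of the first item start the search.
    ends = {i for i, item in enumerate(sequence) if item == pattern[0]}
    # Extend every partial occurrence by one item at a time.
    for item in pattern[1:]:
        ends = {j
                for e in ends
                for j in _window(e, len(sequence), min_gap, max_gap)
                if sequence[j] == item}
    return ends


def support(pattern: Sequence[str], database: List[Sequence[str]],
            min_gap: int, max_gap: int) -> int:
    """Number of sequences containing `pattern` under gap[min_gap, max_gap]."""
    return sum(1 for s in database
               if end_positions(pattern, s, min_gap, max_gap))


def right_extensions(pattern: Sequence[str], sequence: Sequence[str],
                     min_gap: int, max_gap: int) -> Set[str]:
    """Items that can follow `pattern` in `sequence` within the gap window."""
    ends = end_positions(pattern, sequence, min_gap, max_gap)
    return {sequence[j]
            for e in ends
            for j in _window(e, len(sequence), min_gap, max_gap)}


if __name__ == "__main__":
    # Hypothetical toy database, one string of items per sequence.
    db = [list("ABCBC"), list("BABC"), list("AB"), list("BCD")]
    print(support(["A", "C"], db, 0, 1))   # A then C, skipping at most 1 item
    print(support(["A", "C"], db, 0, 0))   # A immediately followed by C
    print(sorted(right_extensions(["A"], db[0], 0, 1)))
```

Tracking only the set of end positions, rather than one leftmost occurrence, matters under a gap constraint: an earlier match of a prefix may violate the gap while a later match of the same prefix satisfies it, so several candidate occurrences have to be kept per sequence.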