Semantics of negative sequential patterns
This work addresses foundational issues in pattern mining, but its contribution is incremental: it clarifies existing concepts without introducing new methods or broad applications.
The paper tackles the ambiguity in defining negative sequential patterns by identifying and formally studying eight possible semantics for pattern containment, and proves that support is anti-monotonic for some of these semantics.
In pattern mining, a negative sequential pattern is specified by a sequence consisting of events that must occur and other events, called negative events, that must be absent. For instance, the pattern $\langle a\ \neg b\ c\rangle$ is contained in a sequence when an occurrence of a is followed by an occurrence of c with no occurrence of b in between. This article sheds light on the ambiguity of this seemingly intuitive notation: we identify eight possible semantics for the containment relation between a pattern and a sequence. These semantics are illustrated and formally studied; in particular, we propose dominance and equivalence relations between them. We also prove that support is anti-monotonic for some of these semantics. Some of the results are discussed with the aim of developing algorithms that efficiently extract frequent negative patterns.
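To make the ambiguity concrete, the following Python sketch checks containment of a pattern such as $\langle a\ \neg b\ c\rangle$ under two different readings. It is only an illustration: the function names, the pattern encoding (a list of positive events plus a mapping from gap index to forbidden event), and the two readings themselves ("some embedding avoids the negative event" versus "every embedding avoids it") are assumptions made here, and they are not claimed to match any of the eight semantics defined in the article.

```python
from itertools import combinations


def embeddings(positives, sequence):
    """Yield each tuple of increasing indices where the positive events occur in order."""
    for idx in combinations(range(len(sequence)), len(positives)):
        if all(sequence[i] == event for i, event in zip(idx, positives)):
            yield idx


def gap_is_clean(sequence, left, right, forbidden):
    """True if `forbidden` does not occur strictly between positions left and right."""
    return forbidden not in sequence[left + 1:right]


def contains(positives, negatives, sequence, every_embedding=False):
    """Check containment of a negative pattern (hypothetical encoding).

    positives: events that must occur, in order, e.g. ["a", "c"].
    negatives: maps gap index g to the event forbidden between
               positives[g] and positives[g + 1], e.g. {0: "b"}.
    every_embedding: if False, one clean embedding is enough;
                     if True, all embeddings must be clean.
    """
    checks = [
        all(gap_is_clean(sequence, emb[g], emb[g + 1], ev)
            for g, ev in negatives.items())
        for emb in embeddings(positives, sequence)
    ]
    if not checks:  # the positive part does not occur at all
        return False
    return all(checks) if every_embedding else any(checks)


if __name__ == "__main__":
    seq = list("abac")
    print(contains(["a", "c"], {0: "b"}, seq))                        # some embedding
    print(contains(["a", "c"], {0: "b"}, seq, every_embedding=True))  # every embedding
```

On the sequence a b a c, the pair (first a, c) has a b in between, while the pair (second a, c) does not, so the two readings disagree about whether the pattern is contained. The enumeration of embeddings is exhaustive and chosen for clarity rather than efficiency.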