Boosting the Learning for Ranking Patterns
This work addresses the challenge of efficiently discovering relevant patterns for users in data mining, offering an incremental improvement over existing methods.
The paper casts the learning of user-specific pattern ranking functions as a multicriteria decision-making problem: interestingness measures are aggregated into a weighted linear function through an interactive learning procedure based on the Analytic Hierarchy Process (AHP). Experiments show that, compared to state-of-the-art approaches, it significantly reduces running time, returns precise rankings, and is robust to user errors.
Discovering relevant patterns for a particular user remains a challenging task in data mining. Several approaches have been proposed to learn user-specific pattern ranking functions. These approaches generalize well, but at the expense of running time. On the other hand, several measures are often used to evaluate the interestingness of patterns, in the hope of revealing a ranking that is as close as possible to the user-specific ranking. In this paper, we formulate the problem of learning pattern ranking functions as a multicriteria decision-making problem. Our approach aggregates different interestingness measures into a single weighted linear ranking function, using an interactive learning procedure that operates in either passive or active mode. A fast learning step elicits the weights of all the measures by means of pairwise comparisons. This approach is based on the Analytic Hierarchy Process (AHP) and uses a set of user-ranked patterns to build a preference matrix, which compares the importance of the measures according to the user-specific interestingness. A sensitivity-based heuristic is proposed for the active learning mode, in order to ensure high-quality results with few user ranking queries. Experiments conducted on well-known datasets show that our approach significantly reduces the running time and returns precise pattern rankings, while being robust to user errors compared with state-of-the-art approaches.
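
To make the weight-elicitation step concrete, the following is a minimal sketch of the standard AHP computation, not the paper's implementation. It assumes the preference matrix is already given (the abstract does not detail how it is derived from user-ranked patterns) and that all measures are scaled to comparable ranges. The names `ahp_weights` and `rank_patterns`, the matrix values, and the pattern scores are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def ahp_weights(pref: Sequence[Sequence[float]],
                iters: int = 100, tol: float = 1e-12) -> List[float]:
    """Approximate the principal eigenvector of a positive pairwise
    comparison matrix by power iteration, normalized to sum to 1.
    This is the textbook AHP weighting step (an assumption here)."""
    n = len(pref)
    w = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pref[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        nxt = [x / total for x in nxt]
        converged = max(abs(a - b) for a, b in zip(nxt, w)) < tol
        w = nxt
        if converged:
            break
    return w


def rank_patterns(scores: Dict[str, Sequence[float]],
                  weights: Sequence[float]) -> List[str]:
    """Sort patterns by the weighted linear aggregation of their
    measure values, highest aggregated score first."""
    def aggregate(pattern: str) -> float:
        return sum(w * s for w, s in zip(weights, scores[pattern]))
    return sorted(scores, key=aggregate, reverse=True)


# Hypothetical preference matrix over three measures: entry [i][j]
# states how much more important measure i is than measure j.
pref = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pref)

# Hypothetical measure values per pattern, already scaled to [0, 1].
scores = {
    "p1": [0.9, 0.1, 0.2],
    "p2": [0.2, 0.9, 0.3],
    "p3": [0.1, 0.2, 0.9],
}
ranking = rank_patterns(scores, weights)
print(weights, ranking)
```

Because the example matrix is perfectly consistent, the weights are proportional to its first column. With real user feedback the matrix is generally inconsistent, and the principal eigenvector then acts as a compromise among the pairwise judgments.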