DBAIAug 27, 2022

A Generic Algorithm for Top-K On-Shelf Utility Mining

arXiv:2208.14230v11 citationsh-index: 167
Originality Incremental advance
AI Analysis

This work addresses a practical issue in data mining for retail and business applications by making pattern discovery more user-friendly and efficient, though it is incremental relative to existing methods.

The paper tackles the difficulty of setting a minimum utility threshold in on-shelf utility mining by proposing a generic algorithm, TOIT, for mining top-k high-utility patterns, which outperforms the state-of-the-art KOSHU algorithm in running time and memory consumption on real datasets.

On-shelf utility mining (OSUM) is an emerging research direction in data mining. It aims to discover itemsets that have high relative utility in their selling time period. Compared with traditional utility mining, OSUM can find more practical and meaningful patterns in real-life applications. However, there is a major drawback to traditional OSUM. For normal users, it is hard to define a minimum threshold minutil for mining the right amount of on-shelf high utility itemsets. On one hand, if the threshold is set too high, the number of patterns would not be enough. On the other hand, if the threshold is set too low, too many patterns will be discovered and cause an unnecessary waste of time and memory consumption. To address this issue, the user usually directly specifies a parameter k, where only the top-k high relative utility itemsets would be considered. Therefore, in this paper, we propose a generic algorithm named TOIT for mining Top-k On-shelf hIgh-utility paTterns to solve this problem. TOIT applies a novel strategy to raise the minutil based on the on-shelf datasets. Besides, two novel upper-bound strategies named subtree utility and local utility are applied to prune the search space. By adopting the strategies mentioned above, the TOIT algorithm can narrow the search space as early as possible, improve the mining efficiency, and reduce the memory consumption, so it can obtain better performance than other algorithms. A series of experiments have been conducted on real datasets with different styles to compare the effects with the state-of-the-art KOSHU algorithm. The experimental results showed that TOIT outperforms KOSHU in both running time and memory consumption.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes