An Efficient Genetic Algorithm for Discovering Diverse-Frequent Patterns
This work addresses the challenge of pattern set mining for data analysts by providing a more efficient solution, though it appears incremental as it builds on existing genetic algorithm approaches.
The paper tackles the problem of efficiently mining diverse frequent patterns from large datasets by proposing a fast heuristic search algorithm based on a genetic algorithm, which outperforms state-of-the-art methods on standard benchmarks and achieves satisfactory results quickly.
Exhaustive search on large datasets is infeasible for several reasons. Recently developed techniques have made pattern set mining feasible by means of a general solver that supports heuristic search, but they suffer from long execution times and are limited to small datasets. In this paper, we investigate an approach that uses a genetic algorithm to mine a diverse set of frequent patterns. We propose a fast heuristic search algorithm that outperforms state-of-the-art methods on a standard set of benchmarks and is capable of producing satisfactory results within a short period of time. Our proposed algorithm uses a relative encoding scheme for the patterns and an effective twin removal technique to ensure diversity throughout the search.
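
To make the general idea concrete, the following is a minimal sketch of a genetic search for diverse frequent patterns with twin removal. It is an illustration under stated assumptions, not the algorithm proposed in the paper: patterns are encoded as plain item sets rather than with the relative encoding scheme (which is not described in this abstract), fitness is assumed to be raw support, diversity is assumed to be measured by Jaccard distance, and every function name (`support`, `remove_twins`, `mine_diverse_frequent`, and so on) is hypothetical.

```python
import random


def support(pattern, transactions):
    """Count the transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)


def jaccard_distance(a, b):
    """Assumed diversity measure: 1 - |a & b| / |a | b|."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def remove_twins(population):
    """Drop duplicate patterns (twins), keeping first occurrences in order."""
    seen, unique = set(), []
    for p in population:
        if p not in seen:
            seen.add(p)
            unique.append(p)
    return unique


def crossover(a, b, rng):
    """Keep shared items; keep each non-shared item with probability 1/2."""
    child = set(a & b)
    for item in sorted(a ^ b):
        if rng.random() < 0.5:
            child.add(item)
    return frozenset(child)


def mutate(pattern, items, rng):
    """Toggle one randomly chosen item in or out of the pattern."""
    return pattern ^ frozenset([rng.choice(items)])


def select_diverse(patterns, k):
    """Greedily pick up to k patterns, maximizing the minimum distance."""
    if not patterns:
        return []
    chosen = [patterns[0]]
    while len(chosen) < min(k, len(patterns)):
        rest = [p for p in patterns if p not in chosen]
        chosen.append(max(
            rest, key=lambda p: min(jaccard_distance(p, c) for c in chosen)))
    return chosen


def mine_diverse_frequent(transactions, min_support, k,
                          generations=50, pop_size=30, seed=0):
    rng = random.Random(seed)
    items = sorted(set().union(*transactions))
    # Start from the frequent single-item patterns.
    population = [frozenset([i]) for i in items
                  if support(frozenset([i]), transactions) >= min_support]
    if not population:
        return []
    for _ in range(generations):
        children = [
            mutate(crossover(rng.choice(population),
                             rng.choice(population), rng), items, rng)
            for _ in range(pop_size)
        ]
        # Twin removal keeps the population free of duplicates.
        candidates = remove_twins(population + children)
        frequent = [p for p in candidates
                    if p and support(p, transactions) >= min_support]
        frequent.sort(key=lambda p: (-support(p, transactions), sorted(p)))
        population = frequent[:pop_size]
    return select_diverse(population, k)


transactions = [frozenset(t) for t in
                ("abc", "ab", "ac", "bc", "abc")]
result = mine_diverse_frequent(transactions, min_support=2, k=3)
```

In this sketch, twin removal runs once per generation on the merged parent and child populations, so no pattern can occupy more than one slot. The final greedy selection is only one possible way to trade off frequency against diversity.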