Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning
This work addresses a computational bottleneck for researchers and practitioners who apply predictive pattern mining to structured data, and it represents an incremental improvement.
The paper tackles the exponential growth of patterns in predictive pattern mining for structured data by proposing the Safe Pattern Pruning (SPP) method, demonstrating its effectiveness through numerical experiments on regression and classification problems.
Predictive pattern mining is an approach for constructing prediction models when the input is structured data, such as sets, graphs, and sequences. The main idea is to build a prediction model that uses substructures present in the data, such as subsets, subgraphs, and subsequences (referred to as patterns), as its features. The primary challenge is that the number of patterns grows exponentially with the complexity of the structured data. In this study, we propose the Safe Pattern Pruning (SPP) method to address this explosion in the number of patterns. We also discuss how SPP can be employed effectively throughout the entire model-building process in practical data analysis. To demonstrate the effectiveness of the proposed method, we conduct numerical experiments on regression and classification problems involving sets, graphs, and sequences.
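To make the pattern-as-feature idea concrete, the following minimal sketch enumerates every subset pattern of a toy set-valued dataset and turns each into a binary feature column. It is not the SPP method, only an illustration of the feature space SPP is meant to prune. The function name `pattern_features`, the `max_size` parameter, and the toy transactions are assumptions made for this example.

```python
from itertools import combinations


def pattern_features(transactions, max_size):
    """Map each itemset pattern (up to max_size items) to a binary feature column.

    Hypothetical helper for illustration; brute-force enumeration, no pruning.
    """
    items = sorted(set().union(*transactions))
    features = {}
    for k in range(1, max_size + 1):
        for pattern in combinations(items, k):
            p = set(pattern)
            # The feature is 1 when the transaction contains every item in the pattern.
            features[pattern] = [int(p <= t) for t in transactions]
    return features


# Toy set-valued inputs (assumed data, not from the paper).
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "d"}, {"a", "b", "d"}]
features = pattern_features(transactions, max_size=4)

print(len(features))          # 2**4 - 1 = 15 candidate patterns for 4 items
print(features[("a", "b")])   # [1, 0, 0, 1]
```

With d distinct items there are 2^d - 1 non-empty subset patterns, so each additional item roughly doubles the number of candidate features. Brute-force enumeration like this quickly becomes infeasible, which is the growth problem the abstract describes.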