Neurosymbolic Association Rule Mining from Tabular Data
This paper addresses rule explosion in association rule mining on high-dimensional datasets, though the contribution appears incremental because it builds on existing neurosymbolic approaches.
The paper tackles rule explosion in association rule mining from high-dimensional tabular data by introducing Aerial+, a neurosymbolic method that uses an under-complete autoencoder to extract concise rules. Evaluated on five datasets against seven baselines, Aerial+ is reported to learn more concise, high-quality rule sets with full data coverage. When integrated into rule-based interpretable machine learning models, it reduces execution time while maintaining or improving accuracy.
Association Rule Mining (ARM) is the task of mining patterns among data features in the form of logical rules, with applications across a myriad of domains. However, high-dimensional datasets often result in an excessive number of rules, increasing execution time and negatively impacting downstream task performance. Managing this rule explosion remains a central challenge in ARM research. To address this, we introduce Aerial+, a novel neurosymbolic ARM method. Aerial+ leverages an under-complete autoencoder to create a neural representation of the data, capturing associations between features. It extracts rules from this neural representation by exploiting the model's reconstruction mechanism. Extensive evaluations on five datasets against seven baselines demonstrate that Aerial+ achieves state-of-the-art results by learning more concise, high-quality rule sets with full data coverage. When integrated into rule-based interpretable machine learning models, Aerial+ significantly reduces execution time while maintaining or improving accuracy.
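To make the notions of a rule and of rule explosion concrete, the following is a minimal sketch of the standard support and confidence measures for a rule of the form "antecedent -> consequent", plus an upper bound on the number of candidate itemsets. It is not the Aerial+ algorithm and involves no autoencoder. The toy dataset and the names `ROWS`, `matches`, `support`, `confidence`, and `candidate_itemsets` are hypothetical and do not come from the paper.

```python
from math import comb

# Toy tabular data (hypothetical, for illustration only):
# each row maps a feature name to a categorical value.
ROWS = [
    {"weather": "rain", "traffic": "heavy", "late": "yes"},
    {"weather": "rain", "traffic": "heavy", "late": "yes"},
    {"weather": "rain", "traffic": "light", "late": "no"},
    {"weather": "sun", "traffic": "light", "late": "no"},
    {"weather": "sun", "traffic": "heavy", "late": "yes"},
    {"weather": "sun", "traffic": "light", "late": "no"},
]


def matches(row, items):
    """True if the row contains every (feature, value) pair in items."""
    return all(row.get(feature) == value for feature, value in items)


def support(rows, items):
    """Fraction of rows that contain all (feature, value) pairs in items."""
    return sum(matches(row, items) for row in rows) / len(rows)


def confidence(rows, antecedent, consequent):
    """Among rows matching the antecedent, the fraction that also match
    the consequent. Returns 0.0 when no row matches the antecedent."""
    covered = [row for row in rows if matches(row, antecedent)]
    if not covered:
        return 0.0
    return sum(matches(row, consequent) for row in covered) / len(covered)


def candidate_itemsets(n_items, max_len):
    """Upper bound on the number of itemsets of size 1..max_len that can be
    formed from n_items distinct (feature, value) pairs. It ignores the fact
    that two values of the same feature cannot co-occur."""
    return sum(comb(n_items, k) for k in range(1, max_len + 1))


if __name__ == "__main__":
    rule_support = support(ROWS, [("traffic", "heavy"), ("late", "yes")])
    rule_confidence = confidence(ROWS, [("traffic", "heavy")], [("late", "yes")])
    print(f"traffic=heavy -> late=yes: support={rule_support}, confidence={rule_confidence}")
    # The candidate space grows quickly with the number of items.
    for n in (10, 100, 1000):
        print(n, candidate_itemsets(n, 3))
```

Even with itemsets capped at three items, the bound grows from 175 candidates for 10 items to 166750 for 100 items, which is the kind of growth that makes exhaustive enumeration costly on high-dimensional tables and motivates methods that extract a smaller rule set directly.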