LG ML | Jul 4, 2020

Building a Competitive Associative Classifier

arXiv:2007.01972v11.28 citations

Originality: Incremental advance

AI Analysis

This work addresses the interpretability and competitiveness of rule-based classifiers when labeled data is limited. It is incremental in that it builds on existing pruning and ensemble techniques.

The authors tackle the problem of rule-based classifiers producing too many rules, which reduces readability and accuracy. They propose SigD2, which uses a two-stage pruning strategy, and apply ensemble methods (bagging and boosting) to SigDirect. On 15 UCI datasets, SigD2 and boosted SigDirect are reported to outperform the compared classifiers, drawn from eight existing systems, in both accuracy and rule count.

With the huge success of deep learning, other machine learning paradigms have had to take a back seat. Yet other models, particularly rule-based ones, are more readable and explainable and can even be competitive when labelled data is not abundant. However, most existing rule-based classifiers produce a large number of classification rules, which hurts model readability. It also hampers classification accuracy, since noisy rules might not add any useful information for classification, and leads to longer classification time. In this study, we propose SigD2, which uses a novel two-stage pruning strategy that prunes most of the noisy, redundant and uninteresting rules and makes the classification model more accurate and readable. To make SigDirect more competitive with the most prevalent but uninterpretable machine learning-based classifiers, such as neural networks and support vector machines, we propose bagging and boosting on the ensemble of the SigDirect classifier. The results of the proposed algorithms are quite promising, and we are able to obtain a minimal set of statistically significant rules for classification without jeopardizing the classification accuracy. We use 15 UCI datasets and compare our approach with eight existing systems. The SigD2 and boosted SigDirect (ACboost) ensemble models outperform various state-of-the-art classifiers not only in terms of classification accuracy but also in terms of the number of rules.
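To make the idea of pruning down to a small set of statistically significant rules concrete, here is a minimal sketch. It is not the paper's SigD2 algorithm: the abstract does not specify the two pruning stages, so the criteria below are assumptions chosen for illustration (a one-sided Fisher exact test for significance, then removal of rules subsumed by a more general rule with equal or higher confidence). The names `Rule`, `fisher_p_value`, `prune`, and the threshold `alpha` are hypothetical.

```python
from dataclasses import dataclass
from math import comb


@dataclass(frozen=True)
class Rule:
    antecedent: frozenset  # items that must all be present in a record
    label: str             # predicted class
    support_x: int         # records matching the antecedent
    support_xc: int        # records matching the antecedent AND the label

    @property
    def confidence(self) -> float:
        return self.support_xc / self.support_x


def fisher_p_value(n: int, n_x: int, n_c: int, n_xc: int) -> float:
    """One-sided Fisher exact test: probability of seeing at least n_xc
    records of class c among n_x records drawn from n, if X and c were
    independent (upper tail of the hypergeometric distribution)."""
    upper = min(n_x, n_c)
    tail = sum(comb(n_c, k) * comb(n - n_c, n_x - k)
               for k in range(n_xc, upper + 1))
    return tail / comb(n, n_x)


def prune(rules, n, class_counts, alpha=0.05):
    """Illustrative two-stage pruning (assumed criteria, not the paper's)."""
    # Stage 1 (assumption): keep only statistically significant rules.
    significant = [
        r for r in rules
        if fisher_p_value(n, r.support_x, class_counts[r.label], r.support_xc) <= alpha
    ]
    # Stage 2 (assumption): drop a rule when a strictly more general kept rule
    # for the same class is at least as confident.
    kept = []
    for r in sorted(significant, key=lambda r: len(r.antecedent)):
        redundant = any(
            k.label == r.label
            and k.antecedent < r.antecedent
            and k.confidence >= r.confidence
            for k in kept
        )
        if not redundant:
            kept.append(r)
    return kept


# Toy example: 20 records, 10 per class.
rules = [
    Rule(frozenset({"a"}), "yes", support_x=8, support_xc=8),
    Rule(frozenset({"a", "b"}), "yes", support_x=5, support_xc=5),
    Rule(frozenset({"c"}), "no", support_x=10, support_xc=5),
]
for rule in prune(rules, n=20, class_counts={"yes": 10, "no": 10}):
    print(sorted(rule.antecedent), "->", rule.label)
```

In this toy data the rule on `c` fails the significance test, and the rule on `{a, b}` is dropped because the more general rule on `{a}` is already as confident, so a single rule remains.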
