LG Jun 17, 2022

Truly Unordered Probabilistic Rule Sets for Multi-class Classification

arXiv:2206.08804v3 10.417 citations h-index: 24 Has Code

Originality: Incremental advance

AI Analysis

This work provides a more interpretable and effective method for multi-class classification, addressing specific bottlenecks in rule set learning, though it is incremental in nature.

The authors address three shortcomings of existing methods for learning interpretable rule sets for multi-class classification: reliance on binary inputs, imposed rule orders, and the lack of probabilistic rule sets. The resulting method, TURS, is reported to improve both interpretability and predictive performance compared with state-of-the-art methods.

Rule set learning has long been studied and has recently been frequently revisited due to the need for interpretable models. Still, existing methods have several shortcomings: 1) most recent methods require a binary feature matrix as input, while learning rules directly from numeric variables is understudied; 2) existing methods impose orders among rules, either explicitly or implicitly, which harms interpretability; and 3) currently no method exists for learning probabilistic rule sets for multi-class target variables (there is only one for probabilistic rule lists). We propose TURS, for Truly Unordered Rule Sets, which addresses these shortcomings. We first formalize the problem of learning truly unordered rule sets. To resolve conflicts caused by overlapping rules, i.e., instances covered by multiple rules, we propose a novel approach that exploits the probabilistic properties of our rule sets. We next develop a two-phase heuristic algorithm that learns rule sets by carefully growing rules. An important innovation is that we use a surrogate score to take the global potential of the rule set into account when learning a local rule. Finally, we empirically demonstrate that, compared to non-probabilistic and (explicitly or implicitly) ordered state-of-the-art methods, our method learns rule sets that not only have better interpretability but also better predictive performance.
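The abstract does not spell out how overlaps are resolved, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that an instance covered by several rules is assigned the class distribution of the training instances in the union of those rules' covers, and that uncovered instances fall back to the distribution of all uncovered training instances. The rule representation (a list of numeric interval conditions) and the names `covers` and `predict_proba` are hypothetical.

```python
from collections import Counter

# Hypothetical rule format: a list of (feature_index, low, high) conditions,
# all of which must hold (low <= x[feature_index] < high). Rules use numeric
# features directly, with no binarization step.
def covers(rule, x):
    return all(low <= x[i] < high for i, low, high in rule)

def predict_proba(rules, X_train, y_train, x, classes):
    """Class probabilities for x from an unordered rule set (illustrative)."""
    active = [r for r in rules if covers(r, x)]
    if active:
        # Assumption: pool the training labels in the union of the covers of
        # every rule that fires on x, so no rule takes priority over another.
        pool = [y for xi, y in zip(X_train, y_train)
                if any(covers(r, xi) for r in active)]
    else:
        # Assumption: a default "else" group made of uncovered training instances.
        pool = [y for xi, y in zip(X_train, y_train)
                if not any(covers(r, xi) for r in rules)]
    if not pool:
        # No evidence at all: fall back to a uniform distribution.
        return {c: 1.0 / len(classes) for c in classes}
    counts = Counter(pool)
    return {c: counts[c] / len(pool) for c in classes}

# Toy data with one numeric feature and three classes.
X_train = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y_train = ["a", "a", "b", "b", "c", "c"]
classes = ["a", "b", "c"]
rule_1 = [(0, 0.0, 3.5)]   # covers x = 1, 2, 3
rule_2 = [(0, 2.5, 4.5)]   # covers x = 3, 4 (overlaps rule_1 at x = 3)
rules = [rule_1, rule_2]

overlap = predict_proba(rules, X_train, y_train, [3.0], classes)
uncovered = predict_proba(rules, X_train, y_train, [5.5], classes)
```

Because the prediction depends only on which rules fire, reordering `rules` leaves the output unchanged, which is the property an unordered rule set is meant to have.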
