A Nested Genetic Algorithm for Explaining Classification Data Sets with Decision Rules
This work addresses the need for interpretable machine learning explanations, but it is incremental as it builds on existing optimization and genetic algorithm methods.
The paper tackles the problem of automatically extracting concise and accurate decision rule sets to explain classification datasets by formulating it as an NP-hard optimization problem and solving it with a nested genetic algorithm, achieving results on ten public datasets.
Our goal in this paper is to automatically extract a set of decision rules (rule set) that best explains a classification data set. First, a large set of decision rules is extracted from a set of decision trees trained on the data set. The rule set should be concise, accurate, have a maximum coverage and minimum number of inconsistencies. This problem can be formalized as a modified version of the weighted budgeted maximum coverage problem, known to be NP-hard. To solve the combinatorial optimization problem efficiently, we introduce a nested genetic algorithm which we then use to derive explanations for ten public data sets.