LGJun 29, 2022Code
TE2Rules: Explaining Tree Ensembles using RulesG Roshan Lal, Xiaotong Chen, Varun Mithal
Tree Ensemble (TE) models, such as Gradient Boosted Trees, often achieve optimal performance on tabular datasets, yet their lack of transparency poses challenges for comprehending their decision logic. This paper introduces TE2Rules (Tree Ensemble to Rules), a novel approach for explaining binary classification tree ensemble models through a list of rules, particularly focusing on explaining the minority class. Many state-of-the-art explainers struggle with minority class explanations, making TE2Rules valuable in such cases. The rules generated by TE2Rules closely approximate the original model, ensuring high fidelity, providing an accurate and interpretable means to understand decision-making. Experimental results demonstrate that TE2Rules scales effectively to tree ensembles with hundreds of trees, achieving higher fidelity within runtimes comparable to baselines. TE2Rules allows for a trade-off between runtime and fidelity, enhancing its practical applicability. The implementation is available here: https://github.com/linkedin/TE2Rules.
LGJul 4, 2022
NN2Rules: Extracting Rule List from Neural NetworksG Roshan Lal, Varun Mithal
We present an algorithm, NN2Rules, to convert a trained neural network into a rule list. Rule lists are more interpretable since they align better with the way humans make decisions. NN2Rules is a decompositional approach to rule extraction, i.e., it extracts a set of decision rules from the parameters of the trained neural network model. We show that the decision rules extracted have the same prediction as the neural network on any input presented to it, and hence the same accuracy. A key contribution of NN2Rules is that it allows hidden neuron behavior to be either soft-binary (eg. sigmoid activation) or rectified linear (ReLU) as opposed to existing decompositional approaches that were developed with the assumption of soft-binary activation.
AIJul 30, 2020Code
Fairness-Aware Online PersonalizationG Roshan Lal, Sahin Cem Geyik, Krishnaram Kenthapadi
Decision making in crucial applications such as lending, hiring, and college admissions has witnessed increasing use of algorithmic models and techniques as a result of a confluence of factors such as ubiquitous connectivity, ability to collect, aggregate, and process large amounts of fine-grained data using cloud computing, and ease of access to applying sophisticated machine learning models. Quite often, such applications are powered by search and recommendation systems, which in turn make use of personalized ranking algorithms. At the same time, there is increasing awareness about the ethical and legal challenges posed by the use of such data-driven systems. Researchers and practitioners from different disciplines have recently highlighted the potential for such systems to discriminate against certain population groups, due to biases in the datasets utilized for learning their underlying recommendation models. We present a study of fairness in online personalization settings involving the ranking of individuals. Starting from a fair warm-start machine-learned model, we first demonstrate that online personalization can cause the model to learn to act in an unfair manner if the user is biased in his/her responses. For this purpose, we construct a stylized model for generating training data with potentially biased features as well as potentially biased labels and quantify the extent of bias that is learned by the model when the user responds in a biased manner as in many real-world scenarios. We then formulate the problem of learning personalized models under fairness constraints and present a regularization based approach for mitigating biases in machine learning. We demonstrate the efficacy of our approach through extensive simulations with different parameter settings. Code: https://github.com/groshanlal/Fairness-Aware-Online-Personalization