Mateusz Staniak

2 Papers

CO · Mar 27, 2019
The Landscape of R Packages for Automated Exploratory Data Analysis

Mateusz Staniak, Przemyslaw Biecek

The increasing availability of large but noisy data sets with many heterogeneous variables has led to growing interest in automating common data analysis tasks. The most time-consuming part of this process is Exploratory Data Analysis, which is crucial for better domain understanding, data cleaning, data validation, and feature engineering. A growing number of libraries attempt to automate some of the typical Exploratory Data Analysis tasks to make the search for new insights easier and faster. In this paper, we present a systematic review of existing tools for Automated Exploratory Data Analysis (autoEDA). We explore the features of twelve popular R packages to identify the parts of the analysis that can be effectively automated with current tools and to point out new directions for further autoEDA development.

ML · Apr 5, 2018
Explanations of model predictions with live and breakDown packages

Mateusz Staniak, Przemyslaw Biecek

Complex models are commonly used in predictive modeling. In this paper, we present R packages that can be used to explain predictions from complex black-box models and to attribute parts of these predictions to input features. We introduce two new approaches and corresponding packages for such attribution, namely live and breakDown. We also compare their results with existing implementations of state-of-the-art solutions, namely lime, which implements Local Interpretable Model-agnostic Explanations, and ShapleyR, which implements Shapley values.