Trust in AutoML: Exploring Information Needs for Establishing Trust in Automated Machine Learning Systems
This paper addresses trust issues for data scientists using AutoML, but its contribution is incremental because it builds on existing human-computer interaction research.
The paper investigates what information influences data scientists' trust in AutoML systems. It finds that transparency features increase trust and understandability, and that model performance metrics and visualizations matter most to data scientists.
We explore trust in a relatively new area of data science: Automated Machine Learning (AutoML). In AutoML, AI methods are used to generate and optimize machine learning models by automatically engineering features, selecting models, and optimizing hyperparameters. In this paper, we seek to understand what kinds of information influence data scientists' trust in the models produced by AutoML. We operationalize trust as a willingness to deploy a model produced using automated methods. We report results from three studies -- qualitative interviews, a controlled experiment, and a card-sorting task -- to understand the information needs of data scientists for establishing trust in AutoML systems. We find that including transparency features in an AutoML tool increased users' trust in the tool and its understandability. Out of all proposed features, model performance metrics and visualizations are the most important information to data scientists when establishing their trust in an AutoML tool.