Evaluation of Interactive Machine Learning Systems
This work addresses the problem of evaluating co-adaptive human-machine systems and is aimed at researchers and practitioners in interactive ML. Its contribution is incremental, as it builds on existing validation methods.
The paper tackles the challenge of evaluating interactive machine learning systems by proposing a dual validation approach that combines algorithm-centered analysis with human-centered evaluation. A visual analytics application serves as an example of how this approach addresses the "black-box" effect.
The evaluation of interactive machine learning systems remains a difficult task. These systems learn from and adapt to the human, but at the same time, the human receives feedback and adapts to the system. Getting a clear understanding of these subtle mechanisms of co-operation and co-adaptation is challenging. In this chapter, we report on our experience in designing and evaluating various interactive machine learning applications from different domains. We argue for coupling two types of validation: algorithm-centered analysis, to study the computational behaviour of the system; and human-centered evaluation, to observe the utility and effectiveness of the application for end-users. We use a visual analytics application for guided search, built using an interactive evolutionary approach, as an exemplar of our work. Our observation is that human-centered design and evaluation complement algorithmic analysis, and can play an important role in addressing the "black-box" effect of machine learning. Finally, we discuss research opportunities that require human-computer interaction methodologies, in order to support both the visible and hidden roles that humans play in interactive machine learning.
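To make the idea of an interactive evolutionary loop and its algorithm-centered analysis concrete, the following is a minimal sketch and not the application described above. It assumes a two-dimensional search space, a population refilled by Gaussian mutation of the selected candidate, and a simulated user who always picks the candidate closest to a hidden target. All names (`TARGET`, `simulated_user`, `interactive_search`, `convergence_trace`) are hypothetical and introduced only for illustration.

```python
import random
from typing import Callable, List, Tuple

Candidate = Tuple[float, float]

# Hypothetical hidden preference of the simulated user (an assumption).
TARGET: Candidate = (0.7, 0.2)


def sq_dist(a: Candidate, b: Candidate) -> float:
    """Squared Euclidean distance between two candidates."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def simulated_user(candidates: List[Candidate]) -> int:
    """Stand-in for a human: return the index of the candidate closest to TARGET."""
    return min(range(len(candidates)), key=lambda i: sq_dist(candidates[i], TARGET))


def interactive_search(
    choose: Callable[[List[Candidate]], int],
    generations: int = 30,
    pop_size: int = 6,
    step: float = 0.2,
    seed: int = 0,
) -> List[Candidate]:
    """Run the loop and return the candidate selected at each generation."""
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    log: List[Candidate] = []
    for _ in range(generations):
        picked = population[choose(population)]  # feedback from the (simulated) user
        log.append(picked)
        # Keep the picked candidate and fill the rest with clamped mutations of it.
        mutants = [
            (
                min(1.0, max(0.0, picked[0] + rng.gauss(0, step))),
                min(1.0, max(0.0, picked[1] + rng.gauss(0, step))),
            )
            for _ in range(pop_size - 1)
        ]
        population = [picked] + mutants
    return log


def convergence_trace(log: List[Candidate]) -> List[float]:
    """Algorithm-centered view: distance of each selection to the hidden target."""
    return [sq_dist(c, TARGET) ** 0.5 for c in log]


if __name__ == "__main__":
    trace = convergence_trace(interactive_search(simulated_user))
    print(f"first distance: {trace[0]:.3f}, last distance: {trace[-1]:.3f}")
```

A trace of this kind only describes the computational behaviour of the loop under an idealized, perfectly consistent user. Real users may change their goals, tire, or adapt to the system, which is why such an analysis would need to be paired with human-centered evaluation.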