Active Learning++: Incorporating Annotator's Rationale using Local Model Explanation
This work addresses the challenge of improving data annotation efficiency in machine learning; its contribution is incremental, since it builds on existing active learning methods.
The paper tackles active learning by incorporating annotators' rationales, expressed as feature importance rankings, into a bagging-based Query by Committee (QBC) sampling strategy; in a simulation study, the resulting framework significantly outperforms a vanilla QBC-based active learning approach.
We propose a new active learning (AL) framework, Active Learning++, which can utilize an annotator's labels as well as their rationale. Annotators can provide their rationale for choosing a label by ranking input features based on their importance for a given query. To incorporate this additional input, we modify the disagreement measure for a bagging-based Query by Committee (QBC) sampling strategy. Instead of weighting all committee models equally to select the next instance, we assign higher weight to the committee models that agree more closely with the annotator's ranking. Specifically, we generate a feature importance-based local explanation for each committee model. The similarity score between the feature ranking provided by the annotator and the local model explanation is used to assign a weight to the corresponding committee model. This approach is applicable to any kind of ML model, since local explanations can be generated with model-agnostic techniques such as LIME. With a simulation study, we show that our framework significantly outperforms a QBC-based vanilla AL framework.
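The weighting idea described above can be illustrated with a minimal sketch. This is not the paper's implementation, and several choices are assumptions made for illustration only: Spearman rank correlation as the similarity score, rescaling it to [0, 1], normalizing the scores into weights, and weighted vote entropy as the disagreement measure. The function names (rank_similarity, committee_weights, weighted_vote_entropy, select_query) are hypothetical, and each committee model's local feature ranking is assumed to be available already (for example, derived from a LIME explanation).

```python
import math
from collections import defaultdict


def rank_similarity(annotator_rank, model_rank):
    """Spearman correlation between two rankings of the same features,
    rescaled from [-1, 1] to [0, 1]. Assumes at least two features, no ties.
    (Spearman is an illustrative choice, not necessarily the paper's.)"""
    n = len(annotator_rank)
    pos_model = {feature: i for i, feature in enumerate(model_rank)}
    d2 = sum((i - pos_model[f]) ** 2 for i, f in enumerate(annotator_rank))
    rho = 1.0 - 6.0 * d2 / (n * (n * n - 1))
    return (rho + 1.0) / 2.0


def committee_weights(annotator_rank, model_ranks):
    """Turn per-model similarity scores into weights that sum to 1."""
    sims = [rank_similarity(annotator_rank, r) for r in model_ranks]
    total = sum(sims)
    if total == 0:
        # No model agrees at all: fall back to equal weights (plain QBC).
        return [1.0 / len(sims)] * len(sims)
    return [s / total for s in sims]


def weighted_vote_entropy(votes, weights):
    """Disagreement for one instance: entropy of the weighted label votes."""
    mass = defaultdict(float)
    for label, w in zip(votes, weights):
        mass[label] += w
    return -sum(p * math.log(p) for p in mass.values() if p > 0)


def select_query(pool_votes, weights):
    """Index of the pool instance with the highest weighted disagreement."""
    scores = [weighted_vote_entropy(v, weights) for v in pool_votes]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical example: three committee models, three features.
annotator_rank = ["age", "income", "tenure"]
model_ranks = [
    ["age", "income", "tenure"],   # matches the annotator exactly
    ["tenure", "income", "age"],   # fully reversed ranking
    ["income", "age", "tenure"],   # partially agrees
]
weights = committee_weights(annotator_rank, model_ranks)

# Each row holds the three models' predicted labels for one pool instance.
pool_votes = [[1, 1, 1], [1, 0, 1], [1, 1, 0]]
next_query = select_query(pool_votes, weights)
```

In this toy setup, unweighted vote entropy would score the second and third pool instances identically, since each has one dissenting model. With the weights above, the dissent in the second instance comes from the model whose explanation is the reverse of the annotator's ranking, so it carries no weight and the third instance is selected instead.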