CL LG
Aug 15, 2021

Deep Active Learning for Text Classification with Diverse Interpretations

Qiang Liu, Yanqiao Zhu, Zhaocheng Liu, Yufeng Zhang, Shu Wu

arXiv:2108.10687v11.816 citations

Originality: Incremental advance

AI Analysis

This work aims to reduce annotation costs in text classification and represents an incremental improvement over existing active learning methods.

The paper addresses the difficulty of measuring sample informativeness in deep active learning for text classification. It proposes ALDEN, which selects samples according to the diversity of their local interpretations; the authors report that ALDEN consistently outperforms several state-of-the-art deep active learning methods.

Recently, Deep Neural Networks (DNNs) have made remarkable progress in text classification, but they still require a large amount of labeled data. To train high-performing models at minimal annotation cost, active learning selects and labels the most informative samples, yet measuring the informativeness of samples for DNNs remains challenging. In this paper, inspired by the piece-wise linear interpretability of DNNs, we propose a novel Active Learning with DivErse iNterpretations (ALDEN) approach. Using local interpretations in DNNs, ALDEN identifies linearly separable regions of samples. It then selects samples according to the diversity of their local interpretations and queries their labels. To tackle the text classification problem, we choose the word with the most diverse interpretations to represent the whole sentence. Extensive experiments demonstrate that ALDEN consistently outperforms several state-of-the-art deep active learning methods.
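
The selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and several parts are assumptions made only for illustration: the model is a one-hidden-layer ReLU scorer applied to each word embedding independently (the paper's text models are more elaborate), a word's local interpretation is taken to be the gradient of that scorer, diversity is measured as the Euclidean distance to the nearest interpretation among already-labeled data, and the function names (`local_interpretation`, `word_diversity`, `sentence_score`, `select_queries`) are hypothetical.

```python
import numpy as np


def local_interpretation(x, W1, b1, w2):
    """Gradient of f(x) = w2 . relu(W1 @ x + b1) with respect to x.

    A ReLU network is linear inside each activation region, so this
    gradient is the local linear weight vector for the region holding x.
    """
    active = (W1 @ x + b1) > 0       # hidden units that fire for this input
    return (w2 * active) @ W1        # shape: (embedding_dim,)


def word_diversity(interp, labeled_interps):
    """Distance from one interpretation to its nearest labeled one (assumed measure)."""
    if len(labeled_interps) == 0:
        return float(np.linalg.norm(interp))
    dists = np.linalg.norm(labeled_interps - interp, axis=1)
    return float(dists.min())


def sentence_score(sentence, labeled_interps, W1, b1, w2):
    """Represent a sentence by its single most diverse word."""
    return max(
        word_diversity(local_interpretation(x, W1, b1, w2), labeled_interps)
        for x in sentence
    )


def select_queries(pool, labeled_interps, W1, b1, w2, k):
    """Return indices of the k pool sentences with the highest diversity score."""
    scores = np.array(
        [sentence_score(s, labeled_interps, W1, b1, w2) for s in pool]
    )
    return np.argsort(-scores)[:k].tolist()
```

Here `pool` is a list of arrays, one per unlabeled sentence, with one row per word embedding, and `labeled_interps` is an array holding the interpretations already covered by labeled data. Taking the maximum over words mirrors the abstract's choice of letting the most diverse word stand for the whole sentence; the distance-to-nearest-labeled measure is only one plausible way to quantify diversity.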
