CL
Apr 17, 2018

Reinforced Co-Training

arXiv:1804.06035v1
120 citations
Originality: Incremental advance
AI Analysis

This paper addresses the inefficient use of unlabeled data in semi-supervised learning for text classification. It is an incremental improvement over existing co-training methods.

The paper tackles sample selection bias in co-training for semi-supervised learning. It proposes Reinforced Co-Training, which uses Q-learning to learn a data selection policy, and reports more accurate text classification on clickbait detection and generic text classification tasks.

Co-training is a popular semi-supervised learning framework to utilize a large amount of unlabeled data in addition to a small labeled set. Co-training methods exploit predicted labels on the unlabeled data and select samples based on prediction confidence to augment the training. However, the selection of samples in existing co-training methods is based on a predetermined policy, which ignores the sampling bias between the unlabeled and the labeled subsets, and fails to explore the data space. In this paper, we propose a novel method, Reinforced Co-Training, to select high-quality unlabeled samples to better co-train on. More specifically, our approach uses Q-learning to learn a data selection policy with a small labeled dataset, and then exploits this policy to train the co-training classifiers automatically. Experimental results on clickbait detection and generic text classification tasks demonstrate that our proposed method can obtain more accurate text classification results.
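To make the idea of a learned selection policy concrete, here is a minimal toy sketch of tabular Q-learning that chooses which unlabeled subset to add next. It is not the paper's implementation. Everything specific in it is an assumption made for illustration: the unlabeled pool is taken to be pre-split into subsets, the state is the set of subsets already chosen, the reward is a fixed hypothetical validation-accuracy gain per subset (`subset_gain`) rather than a score measured by retraining real classifiers on the labeled set, and the function name `learn_selection_policy` is invented.

```python
import random
from collections import defaultdict


def learn_selection_policy(subset_gain, picks=2, episodes=2000,
                           alpha=0.2, gamma=0.5, epsilon=0.3, seed=0):
    """Toy tabular Q-learning over which unlabeled subset to add next.

    subset_gain[a] is a HYPOTHETICAL, fixed change in validation accuracy
    obtained by adding subset `a`; a real system would measure this by
    retraining the co-training classifiers and scoring them on labeled data.
    """
    rng = random.Random(seed)
    n = len(subset_gain)
    q = defaultdict(float)  # maps (state, action) to an estimated value

    def available(state):
        # Subsets that have not been added to the training set yet.
        return [a for a in range(n) if a not in state]

    for _ in range(episodes):
        state = frozenset()  # nothing selected at the start of an episode
        for step in range(picks):
            actions = available(state)
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            reward = subset_gain[action]
            next_state = state | {action}
            if step == picks - 1:
                future = 0.0  # episode ends, nothing to bootstrap from
            else:
                future = max(q[(next_state, a)] for a in available(next_state))
            # Standard Q-learning update.
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state

    # Greedy rollout of the learned policy.
    state, chosen = frozenset(), []
    for _ in range(picks):
        action = max(available(state), key=lambda a: q[(state, a)])
        chosen.append(action)
        state = state | {action}
    return chosen


if __name__ == "__main__":
    # Four hypothetical subsets; the third one hurts validation accuracy.
    print(learn_selection_policy([0.01, 0.08, -0.02, 0.03]))
```

In this toy setting the learned policy picks the most helpful subsets first and avoids the harmful one, in contrast to a fixed rule that selects purely by prediction confidence.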
