LG · ML · Oct 23, 2020

Online Semi-Supervised Learning with Bandit Feedback

arXiv:2010.12574v1 · 7 citations
Originality: Incremental advance
AI Analysis

This paper addresses online decision-making with limited labeled data in domains such as healthcare and advertising, though its contribution appears incremental.

The paper combines semi-supervised learning with contextual bandits for applications such as clinical trials and ad recommendations, and verifies the resulting algorithms on real-world datasets.

We formulate a new problem at the intersection of semi-supervised learning and contextual bandits, motivated by several applications including clinical trials and ad recommendations. We demonstrate how Graph Convolutional Network (GCN), a semi-supervised learning approach, can be adjusted to the new problem formulation. We also propose a variant of the linear contextual bandit with semi-supervised missing rewards imputation. We then take the best of both approaches to develop multi-GCN embedded contextual bandit. Our algorithms are verified on several real world datasets.
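To make the idea of a linear contextual bandit with imputed missing rewards concrete, here is a minimal sketch. It is not the paper's algorithm: it pairs a standard per-arm LinUCB update with a simple nearest-neighbor imputation rule chosen purely for illustration. The class name `ImputingLinUCB`, its methods, and the `alpha` exploration parameter are all hypothetical.

```python
import numpy as np


class ImputingLinUCB:
    """Per-arm LinUCB with a stand-in imputation step for missing rewards.

    Assumption: a missing reward is replaced by the reward of the nearest
    previously observed context for the same arm. The paper's imputation
    scheme may differ; this is only an illustrative sketch.
    """

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha                                # exploration weight
        self.A = [np.eye(dim) for _ in range(n_arms)]     # ridge Gram matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]   # reward-weighted sums
        self.seen = [[] for _ in range(n_arms)]           # (context, observed reward)

    def select(self, x):
        """Return the arm with the highest upper confidence bound for context x."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def impute(self, arm, x):
        """Guess a missing reward from the nearest labeled context, if any."""
        if not self.seen[arm]:
            return None
        contexts = np.array([c for c, _ in self.seen[arm]])
        rewards = np.array([r for _, r in self.seen[arm]])
        nearest = np.argmin(np.linalg.norm(contexts - x, axis=1))
        return float(rewards[nearest])

    def update(self, arm, x, reward=None):
        """Update the chosen arm; reward=None means no feedback was received."""
        if reward is None:
            reward = self.impute(arm, x)
            if reward is None:        # nothing to impute from yet: skip the update
                return
        else:
            self.seen[arm].append((x.copy(), float(reward)))
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```

In a typical round, the learner would call `select(x)` on the incoming context, play the returned arm, and then call `update(arm, x, reward)` with `reward=None` whenever feedback is unavailable. Only truly observed rewards are stored for later imputation, so imputed values do not feed back into future imputations.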

