LG · AI · IR · May 18, 2012

Online Structured Prediction via Coactive Learning

arXiv:1205.4213v2 · 66 citations
Originality: Incremental advance
AI Analysis

This paper addresses structured prediction in online systems such as search engines and recommender systems, where user feedback is imperfect, and offers a practical, incremental improvement over conventional online learning methods.

The paper tackles the problem of interactive learning with non-optimal user feedback, such as clicks in web search, by proposing Coactive Learning, which achieves an average regret of O(1/√T) without observing cardinal utility values.
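
For reference, the average regret in that bound can be written out as follows. This is a sketch in assumed notation that the summary itself does not fix: $U$ is the user's (unobserved) utility function, $x_t$ the context at step $t$, $y_t$ the system's prediction, and $y_t^*$ the utility-maximizing object for $x_t$.

$$\mathrm{REG}_T = \frac{1}{T} \sum_{t=1}^{T} \bigl( U(x_t, y_t^*) - U(x_t, y_t) \bigr)$$

An average regret of ${\cal O}(\frac{1}{\sqrt{T}})$ then means the per-step utility gap to the best possible object shrinks toward zero as the number of interactions $T$ grows.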

We propose Coactive Learning as a model of interaction between a learning system and a human user, where both have the common goal of providing results of maximum utility to the user. At each step, the system (e.g. search engine) receives a context (e.g. query) and predicts an object (e.g. ranking). The user responds by correcting the system if necessary, providing a slightly improved -- but not necessarily optimal -- object as feedback. We argue that such feedback can often be inferred from observable user behavior, for example, from clicks in web-search. Evaluating predictions by their cardinal utility to the user, we propose efficient learning algorithms that have ${\cal O}(\frac{1}{\sqrt{T}})$ average regret, even though the learning algorithm never observes cardinal utility values as in conventional online learning. We demonstrate the applicability of our model and learning algorithms on a movie recommendation task, as well as ranking for web-search.
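
The interaction loop described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a linear utility model and a perceptron-style update that adds the feature difference between the user's feedback and the system's prediction. The function `run_coactive`, the simulated user, the hidden vector `w_true`, and all parameter values are hypothetical and exist only for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def run_coactive(T=2000, dim=5, n_candidates=10, seed=0):
    """Simulate coactive learning; return the per-round regret.

    Assumptions (not from the source): utility is linear in the features,
    and the simulated user returns a randomly chosen candidate that is
    strictly better than the prediction, so feedback is improved but not
    necessarily optimal.
    """
    rng = random.Random(seed)
    # Hidden user utility weights; used only by the simulated user and to
    # measure regret, never read by the learner.
    w_true = [rng.uniform(-1, 1) for _ in range(dim)]
    w = [0.0] * dim  # learner's weight vector
    regrets = []
    for _ in range(T):
        # Context: a set of candidate objects, one feature vector each.
        cands = [[rng.uniform(-1, 1) for _ in range(dim)]
                 for _ in range(n_candidates)]
        # System predicts the candidate it currently scores highest.
        pred = max(range(n_candidates), key=lambda j: dot(w, cands[j]))
        true_u = [dot(w_true, c) for c in cands]
        regrets.append(max(true_u) - true_u[pred])
        # User corrects the system with some strictly better candidate.
        better = [j for j in range(n_candidates) if true_u[j] > true_u[pred]]
        if better:
            fb = rng.choice(better)
            # Update uses only the feature difference between feedback and
            # prediction; no cardinal utility value is observed.
            w = [wi + a - b for wi, a, b in zip(w, cands[fb], cands[pred])]
    return regrets


if __name__ == "__main__":
    r = run_coactive()
    print("average regret, first 200 rounds:", sum(r[:200]) / 200)
    print("average regret, last 200 rounds: ", sum(r[-200:]) / 200)
```

The learner only ever sees which object the user preferred over its prediction, which mirrors the abstract's point that cardinal utilities are never observed. The regret values are computed by the simulation purely for evaluation.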

