LG, DC
Jun 13, 2016

Making Contextual Decisions with Low Technical Debt

arXiv:1606.03966v2
51 citations
Originality: Incremental advance
AI Analysis

This system reduces technical debt for practitioners who apply contextual bandit learning in applications such as content recommendation and tech support. The advance is rated incremental because it builds on existing reinforcement learning methods.

The paper tackles the problem of applying contextual bandit algorithms in practice by building the Decision Service, a general system that addresses sources of technical debt such as incorrect data collection and weak debuggability. It reports live production deployments with click-through improvements of 25-30% and an 18% revenue lift.

Applications and systems are constantly faced with decisions that require picking from a set of actions based on contextual information. Reinforcement-based learning algorithms such as contextual bandits can be very effective in these settings, but applying them in practice is fraught with technical debt, and no general system exists that supports them completely. We address this and create the first general system for contextual learning, called the Decision Service. Existing systems often suffer from technical debt that arises from issues like incorrect data collection and weak debuggability, issues we systematically address through our ML methodology and system abstractions. The Decision Service enables all aspects of contextual bandit learning using four system abstractions which connect together in a loop: explore (the decision space), log, learn, and deploy. Notably, our new explore and log abstractions ensure the system produces correct, unbiased data, which our learner uses for online learning and to enable real-time safeguards, all in a fully reproducible manner. The Decision Service has a simple user interface and works with a variety of applications: we present two live production deployments for content recommendation that achieved click-through improvements of 25-30%, another with 18% revenue lift in the landing page, and ongoing applications in tech support and machine failure handling. The service makes real-time decisions and learns continuously and scalably, while significantly lowering technical debt.
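The explore, log, learn, deploy loop described in the abstract can be illustrated with a toy sketch. This is not the Decision Service API: every name here (`LoggedDecision`, `explore`, `ips_value`, `learn`, `run_loop`) is hypothetical, epsilon-greedy exploration and inverse propensity scoring are assumed as stand-ins for whatever the system actually uses, and the two-context environment is invented purely for illustration. The point it tries to show is why logging the probability of each chosen action matters: with that probability recorded, logged data can be reweighted to give an unbiased estimate of how a different policy would have performed.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LoggedDecision:
    """One logged interaction: what was seen, what was done, and how likely it was."""
    context: str
    action: int
    probability: float  # probability with which `action` was chosen (its propensity)
    reward: float


def explore(policy, context, n_actions, epsilon, rng):
    """Epsilon-greedy exploration (an assumed strategy). Returns (action, probability)."""
    greedy = policy(context)
    action = rng.randrange(n_actions) if rng.random() < epsilon else greedy
    # The uniform branch can also pick the greedy action, so its mass is added in.
    probability = epsilon / n_actions + (1.0 - epsilon if action == greedy else 0.0)
    return action, probability


def ips_value(log, policy):
    """Inverse propensity scoring estimate of a policy's average reward on the log."""
    if not log:
        return 0.0
    matched = (d.reward / d.probability for d in log if policy(d.context) == d.action)
    return sum(matched) / len(log)


def learn(log, n_actions):
    """Per context, pick the action with the largest propensity-weighted reward."""
    totals = defaultdict(lambda: [0.0] * n_actions)
    for d in log:
        totals[d.context][d.action] += d.reward / d.probability
    table = {c: max(range(n_actions), key=s.__getitem__) for c, s in totals.items()}
    return lambda context: table.get(context, 0)


def run_loop(rounds=2000, epsilon=0.2, seed=0):
    """Run explore -> log -> learn once; the returned policy is what would be deployed."""
    rng = random.Random(seed)
    best = {"sports": 0, "news": 1}  # toy environment, hidden from the learner
    deployed = lambda context: 0     # initial policy: always choose action 0
    log = []
    for _ in range(rounds):
        context = rng.choice(sorted(best))
        action, p = explore(deployed, context, 2, epsilon, rng)
        reward = 1.0 if action == best[context] else 0.0
        log.append(LoggedDecision(context, action, p, reward))
    learned = learn(log, 2)
    return log, deployed, learned


if __name__ == "__main__":
    log, deployed, learned = run_loop()
    print("estimated value of deployed policy:", round(ips_value(log, deployed), 3))
    print("estimated value of learned policy: ", round(ips_value(log, learned), 3))
```

In this sketch the deploy step is simply the caller swapping `deployed` for `learned` before the next batch of rounds. If `probability` were not logged, or were logged incorrectly, `ips_value` would no longer be an unbiased estimate, which is the kind of data-collection error the abstract attributes to existing systems.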

