LG, MS, ML | Feb 25, 2014

Machine Learning at Scale

arXiv:1402.6076v1 | 9 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of scaling machine learning for digital advertising campaigns, but its contribution is incremental: it focuses on implementation details rather than novel algorithmic breakthroughs.

The paper tackles the challenge of building thousands of predictive models for digital advertising at scale, processing hundreds of terabytes of data in a chaotic real-time environment, and presents an automated platform that impacts billions of monthly advertising impressions.

It takes skill to build a meaningful predictive model even with the abundance of implementations of modern machine learning algorithms and readily available computing resources. Building a model becomes challenging if hundreds of terabytes of data need to be processed to produce the training data set. In a digital advertising technology setting, we are faced with the need to build thousands of such models that predict user behavior and power advertising campaigns in a 24/7 chaotic real-time production environment. As data scientists, we also have to convince other internal departments critical to implementation success, our management, and our customers that our machine learning system works. In this paper, we present the details of the design and implementation of an automated, robust machine learning platform that impacts billions of advertising impressions monthly. This platform enables us to continuously optimize thousands of campaigns over hundreds of millions of users, on multiple continents, against varying performance objectives.
