AI · LG · Apr 17, 2020

DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications

arXiv:2004.08366v1 · 1 citations
AI Analysis

This work addresses the problem of scaling deep learning models with sparse features for applications like online advertising, enabling more flexible and efficient systems, though it builds incrementally on existing frameworks like TensorFlow.

The paper tackles the limitation of deep learning models with sparse features requiring predefined dictionaries by proposing a theory and system design that decouples content from form, enabling models to handle arbitrary numbers of distinct sparse features and grow without redefinition. It shows that the resulting models perform better and run efficiently at a larger scale, with a production model achieving significant accuracy gains in suggesting keywords for Google Smart Campaigns.

One of the limitations of deep learning models with sparse features today stems from the predefined nature of their input, which requires a dictionary to be defined prior to training. With this paper we propose both a theory and a working system design which remove this limitation, and show that the resulting models are able to perform better and run efficiently at a much larger scale. Specifically, we achieve this by decoupling a model's content from its form to tackle architecture evolution and memory growth separately. To efficiently handle model growth, we propose a new neuron model, called DynamicCell, drawing inspiration from the free energy principle [15] to introduce the concept of reaction to discharge non-digestive energy, which also subsumes gradient descent based approaches as its special cases. We implement DynamicCell by introducing a new server into TensorFlow to take over most of the work involving model growth. Consequently, it enables any existing deep learning model to efficiently handle an arbitrary number of distinct sparse features (e.g., search queries), and grow incessantly without redefining the model. Most notably, one of our models, which has been reliably running in production for over a year, is capable of suggesting high quality keywords for advertisers of Google Smart Campaigns and achieved significant accuracy gains based on a challenging metric -- evidence that data-driven, self-evolving systems can potentially exceed the performance of traditional rule-based approaches.
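The contrast between a predefined dictionary and a model that grows with its input can be illustrated with a small sketch. This is not the paper's DynamicCell or its TensorFlow server; it is a hypothetical in-memory table (the class name `GrowingEmbeddingTable`, its methods, and all parameters are assumptions made for illustration) that allocates an embedding row the first time a sparse key such as a search query is seen, and updates rows with a plain gradient descent step.

```python
import random


class GrowingEmbeddingTable:
    """Hypothetical table mapping arbitrary string keys to vectors.

    Illustrative only: no vocabulary is declared up front, so the
    table grows as new keys arrive.
    """

    def __init__(self, dim, seed=0, init_scale=0.1):
        self.dim = dim
        self.init_scale = init_scale
        self._rng = random.Random(seed)
        self._rows = {}  # key -> list of floats, created lazily

    def lookup(self, key):
        # Allocate a randomly initialised row the first time a key is seen.
        if key not in self._rows:
            self._rows[key] = [
                self._rng.uniform(-self.init_scale, self.init_scale)
                for _ in range(self.dim)
            ]
        return self._rows[key]

    def apply_gradient(self, key, grad, lr=0.1):
        # Plain SGD step on a single row (assumed update rule).
        row = self.lookup(key)
        for i in range(self.dim):
            row[i] -= lr * grad[i]

    def __len__(self):
        return len(self._rows)


table = GrowingEmbeddingTable(dim=4)
for query in ["cheap flights", "pizza near me", "cheap flights"]:
    table.lookup(query)
print(len(table))  # 2 distinct keys, none declared in advance
```

A production system would additionally need sharding, eviction, and persistence, which is the kind of work the abstract assigns to a separate server; none of that is modelled here.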

