ML · LG · Feb 24, 2017

Changing Model Behavior at Test-Time Using Reinforcement Learning

arXiv:1702.07780v1 · 57 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses resource management for models deployed on devices such as embedded systems or cell phones. It is incremental, however, because it builds on existing mixture-of-experts and RL techniques.

The authors tackle the problem of adapting machine learning models to test-time constraints such as real-time inference or power efficiency. They propose a mixture-of-experts model whose resource usage is adjusted per input using reinforcement learning, and they test it on a small MNIST-based example. No concrete performance numbers are reported.

Machine learning models are often used at test-time subject to constraints and trade-offs not present at training-time. For example, a computer vision model operating on an embedded device may need to perform real-time inference, or a translation model operating on a cell phone may wish to bound its average compute time in order to be power-efficient. In this work we describe a mixture-of-experts model and show how to change its test-time resource-usage on a per-input basis using reinforcement learning. We test our method on a small MNIST-based example.
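The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify the architecture, the reward, or the RL algorithm, so everything here is an assumption. The two experts, their costs and accuracies, the `cost_weight` parameter, and the function names `train_gate` and `preferred_expert` are all hypothetical. The sketch trains a softmax gate that picks one expert per input type, using the exact gradient of expected reward (a noise-free stand-in for a sampled policy-gradient method), where reward is accuracy minus a weighted compute cost.

```python
import math

# Hypothetical toy setup (not from the paper): two experts with made-up
# per-call compute costs and made-up accuracies on "easy" and "hard" inputs.
EXPERTS = ["cheap", "expensive"]
COST = {"cheap": 0.1, "expensive": 1.0}
ACCURACY = {
    "easy": {"cheap": 1.0, "expensive": 1.0},
    "hard": {"cheap": 0.5, "expensive": 1.0},
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_gate(cost_weight, steps=2000, lr=1.0):
    """Learn a softmax policy over experts for each input type.

    Assumed reward for picking expert e on an input of a given type:
        expected accuracy of e on that type - cost_weight * COST[e]
    The update follows the exact gradient of expected reward.
    """
    logits = {kind: [0.0] * len(EXPERTS) for kind in ACCURACY}
    for _ in range(steps):
        for kind, theta in logits.items():
            probs = softmax(theta)
            rewards = [ACCURACY[kind][e] - cost_weight * COST[e] for e in EXPERTS]
            expected = sum(p * r for p, r in zip(probs, rewards))
            # d(expected reward) / d(theta_a) = pi_a * (r_a - expected reward)
            for a in range(len(EXPERTS)):
                theta[a] += lr * probs[a] * (rewards[a] - expected)
    return {kind: dict(zip(EXPERTS, softmax(theta))) for kind, theta in logits.items()}


def preferred_expert(policy, kind):
    """Return the expert with the highest gate probability for this input type."""
    return max(policy[kind], key=policy[kind].get)


if __name__ == "__main__":
    for weight in (0.3, 2.0):
        policy = train_gate(weight)
        choices = {kind: preferred_expert(policy, kind) for kind in ACCURACY}
        print(f"cost_weight={weight}: {choices}")
```

In this toy setting, a small `cost_weight` leads the gate to send hard inputs to the expensive expert, while a large one makes compute dominate and pushes every input to the cheap expert. That is the kind of per-input resource trade-off the abstract describes, under the assumptions stated above.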
