LG · NE · Aug 8, 2016

Online Adaptation of Deep Architectures with Reinforcement Learning

arXiv:1608.02292v1 · 5 citations
Originality: Incremental advance
AI Analysis

This paper addresses covariate shift in online learning, a challenge for machine learning practitioners, though it appears incremental because it builds on existing reinforcement learning and autoencoder techniques.

The paper tackles the problem of adapting deep architectures to changing data distributions in online learning. It proposes a reinforcement learning method that modifies the structure of a stacked Denoising Autoencoder, which the authors report is more responsive and robust than its counterparts and better at preserving prior knowledge.

Online learning has become crucial to many problems in machine learning. As more data is collected sequentially, quickly adapting to changes in the data distribution can offer several competitive advantages, such as avoiding loss of prior knowledge and more efficient learning. However, adaptation to changes in the data distribution (also known as covariate shift) needs to be performed without compromising past knowledge already built into the model, so that it can cope with voluminous and dynamic data. In this paper, we propose an online stacked Denoising Autoencoder whose structure is adapted through reinforcement learning. Our algorithm forces the network to exploit and explore favourable architectures, employing an estimated utility function that maximises the accuracy on an unseen validation sequence. Different actions, such as Pool, Increment and Merge, are available to modify the structure of the network. As we observe through a series of experiments, our approach is more responsive, robust, and principled than its counterparts for non-stationary as well as stationary data distributions. Experimental results indicate that our algorithm performs better at preserving gained prior knowledge and responding to changes in the data distribution.
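
The idea of choosing structural actions through an estimated utility can be illustrated with a small sketch. This is not the paper's algorithm: the abstract names the actions (Pool, Increment, Merge) and a validation-accuracy objective, but it does not give the state representation, reward, or update rule. Everything below is an assumption for illustration: tabular epsilon-greedy Q-learning, a state equal to the hidden-layer size, a toy `toy_validation_accuracy` function standing in for a real validation score, and the helper names `StructureAgent`, `apply_action` and `run`.

```python
import random
from collections import defaultdict

# Action names taken from the abstract; their effects below are assumed.
ACTIONS = ("pool", "increment", "merge")


class StructureAgent:
    """Tabular epsilon-greedy Q-learning over structural actions (illustrative)."""

    def __init__(self, epsilon=0.1, alpha=0.5, gamma=0.9, seed=0):
        self.q = defaultdict(float)  # (state, action) -> estimated utility
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma
        self.rng = random.Random(seed)

    def choose(self, state):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def toy_validation_accuracy(hidden_units, ideal_units):
    # Hypothetical stand-in for validation accuracy: highest when the layer
    # size matches an (assumed) ideal size for the current data distribution.
    return max(0.0, 1.0 - abs(hidden_units - ideal_units) / 100.0)


def apply_action(hidden_units, action, step=10):
    # Assumed effects: increment adds units, merge removes units,
    # pool leaves the size unchanged (it would retrain on pooled data).
    if action == "increment":
        return hidden_units + step
    if action == "merge":
        return max(step, hidden_units - step)
    return hidden_units


def run(steps=200, seed=0):
    agent = StructureAgent(seed=seed)
    hidden = 50
    for t in range(steps):
        # Simulated change in the data distribution halfway through.
        ideal = 50 if t < steps // 2 else 80
        state = hidden
        action = agent.choose(state)
        hidden = apply_action(hidden, action)
        reward = toy_validation_accuracy(hidden, ideal)
        agent.update(state, action, reward, hidden)
    return agent, hidden
```

Calling `run()` returns the trained agent and the final layer size. In a real setting the reward would come from evaluating the autoencoder on held-out data, and the state would likely carry more information than the layer size alone.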

