LG | Mar 15, 2023

Gated Compression Layers for Efficient Always-On Models

arXiv:2303.08970v1 | 4 citations | h-index: 54
AI Analysis

This paper addresses the problem of building efficient always-on models for mobile and embedded developers. It proposes a new layer type that balances power and accuracy, rather than an incremental refinement of existing approaches.

The paper tackles the trade-off between accuracy and power consumption in on-device machine learning by introducing Gated Compression layers, which transform existing neural networks into Gated Neural Networks. Across five public image and audio datasets, these layers stop up to 96% of negative samples and compress 97% of positive samples while maintaining or improving accuracy.

Mobile and embedded machine learning developers frequently have to compromise between two inferior on-device deployment strategies: sacrifice accuracy and aggressively shrink their models to run on dedicated low-power cores; or sacrifice battery by running larger models on more powerful compute cores such as neural processing units or the main application processor. In this paper, we propose a novel Gated Compression layer that can be applied to transform existing neural network architectures into Gated Neural Networks. Gated Neural Networks have multiple properties that excel for on-device use cases that help significantly reduce power, boost accuracy, and take advantage of heterogeneous compute cores. We provide results across five public image and audio datasets that demonstrate the proposed Gated Compression layer effectively stops up to 96% of negative samples, compresses 97% of positive samples, while maintaining or improving model accuracy.
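The two behaviours named in the abstract, stopping negative samples early and compressing the positive samples that pass, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `gated_compression`, the sigmoid gate over a weighted sum, the fixed `threshold`, and the boolean `keep_mask` are all hypothetical choices made only to show the idea.

```python
import math
from typing import List, Optional


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_compression(
    features: List[float],
    gate_weights: List[float],
    gate_bias: float,
    keep_mask: List[bool],
    threshold: float = 0.5,
) -> Optional[List[float]]:
    """Hypothetical gate-then-compress step (an assumption, not the paper's layer).

    Returns None when the gate stops the sample, so later and more
    expensive stages never run. Otherwise returns only the features
    selected by keep_mask, so less data moves to the next compute core.
    """
    # Gate: a single score deciding whether the sample is worth propagating.
    score = sigmoid(sum(w * f for w, f in zip(gate_weights, features)) + gate_bias)
    if score < threshold:
        return None  # early stop: treated as a negative sample

    # Compression: drop the feature dimensions that the mask marks as unused.
    return [f for f, keep in zip(features, keep_mask) if keep]


# Example inputs (made-up values for illustration only).
weights = [1.0, 1.0, 0.0, 0.0]
mask = [False, True, False, True]

passed = gated_compression([0.2, 1.5, -0.3, 0.8], weights, -1.0, mask)
stopped = gated_compression([0.0, 0.1, 0.0, 0.0], weights, -1.0, mask)
print(passed)   # [1.5, 0.8]
print(stopped)  # None
```

In this sketch the power saving would come from two places: a stopped sample never wakes the larger downstream model, and a passed sample carries only half of its original feature dimensions across the boundary between compute cores.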

