Structured Transforms for Small-Footprint Deep Learning
This work addresses the need for compact and efficient deep learning models on mobile devices, offering a novel method that incrementally improves the tradeoffs among accuracy, size, and speed.
The paper tackles the problem of deploying deep learning on mobile devices with limited storage and power by proposing a framework for learning structured parameter matrices with low displacement rank, achieving over 3.5-fold compression while nearly matching state-of-the-art performance in keyword spotting applications.
We consider the task of building compact deep learning pipelines suitable for deployment on storage- and power-constrained mobile devices. We propose a unified framework to learn a broad family of structured parameter matrices that are characterized by the notion of low displacement rank. Our structured transforms admit fast function and gradient evaluation, and span a rich range of parameter sharing configurations whose statistical modeling capacity can be explicitly tuned along a continuum from structured to unstructured. Experimental results show that these transforms can significantly accelerate inference and forward/backward passes during training, and offer superior accuracy-compactness-speed tradeoffs in comparison to a number of existing techniques. In keyword spotting applications in mobile speech recognition, our methods are much more effective than standard linear low-rank bottleneck layers and nearly retain the performance of state-of-the-art models, while providing more than 3.5-fold compression.
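To make the notion of low displacement rank concrete, the following is a minimal NumPy sketch, not the implementation from this work. It assumes the Sylvester-type displacement operator L(T) = Z_1 T - T Z_{-1} that is commonly used for Toeplitz matrices, checks that a Toeplitz matrix has displacement rank at most 2 under it, and illustrates fast function evaluation with an FFT-based matrix-vector product. All function names (`unit_f_circulant`, `toeplitz`, `toeplitz_matvec_fft`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def unit_f_circulant(n, f):
    """Return Z_f: ones on the subdiagonal and f in the top-right corner."""
    Z = np.zeros((n, n))
    Z[np.arange(1, n), np.arange(n - 1)] = 1.0
    Z[0, n - 1] = f
    return Z


def toeplitz(first_col, first_row):
    """Dense n x n Toeplitz matrix; first_row[0] is ignored in favor of first_col[0]."""
    first_col = np.asarray(first_col, dtype=float)
    first_row = np.asarray(first_row, dtype=float)
    n = len(first_col)
    i, j = np.indices((n, n))
    d = i - j
    return np.where(d >= 0, first_col[np.abs(d)], first_row[np.abs(d)])


def toeplitz_matvec_fft(first_col, first_row, x):
    """Compute T @ x in O(n log n) by embedding T in a 2n x 2n circulant matrix."""
    first_col = np.asarray(first_col, dtype=float)
    first_row = np.asarray(first_row, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(x)
    # First column of the circulant embedding: [col, 0, row reversed without row[0]].
    c = np.concatenate([first_col, [0.0], first_row[:0:-1]])
    x_pad = np.concatenate([x, np.zeros(n)])
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x_pad))
    return y[:n].real


# Small demonstration with random data (sizes and seed are arbitrary).
rng = np.random.default_rng(0)
n = 8
col = rng.standard_normal(n)
row = rng.standard_normal(n)
row[0] = col[0]
T = toeplitz(col, row)

# Displacement of T: nonzero only in the first row and last column.
displacement = unit_f_circulant(n, 1.0) @ T - T @ unit_f_circulant(n, -1.0)
print("displacement rank:", np.linalg.matrix_rank(displacement))

x = rng.standard_normal(n)
print("FFT matvec matches dense:", np.allclose(T @ x, toeplitz_matvec_fft(col, row, x)))
```

The dense matrix stores n^2 entries, whereas the Toeplitz form is determined by 2n - 1 parameters and never needs to be materialized for the product. This is the kind of structure that lets parameter count and evaluation cost be traded against modeling capacity.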