CV · Oct 2, 2018

Training compact deep learning models for video classification using circulant matrices

arXiv:1810.01140v2 · 15 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient deep learning models that can be deployed on resource-constrained devices such as mobile platforms. It is an incremental advance, since it builds on existing network architectures.

The paper tackles the problem of reducing model size for video classification by imposing circulant matrix structure on weight matrices, achieving a compact DBoF embedding that balances size and accuracy on the YouTube-8M dataset.

In real-world scenarios, model accuracy is hardly the only factor to consider. Large models consume more memory and are computationally more intensive, which makes them difficult to train and to deploy, especially on mobile devices. In this paper, we build on recent results at the crossroads of Linear Algebra and Deep Learning which demonstrate how imposing a structure on large weight matrices can be used to reduce the size of the model. We propose very compact models for video classification based on state-of-the-art network architectures such as Deep Bag-of-Frames, NetVLAD and NetFisherVectors. We then conduct thorough experiments using the large YouTube-8M video classification dataset. As we will show, the circulant DBoF embedding achieves an excellent trade-off between size and accuracy.
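To make the idea of a structured weight matrix concrete, here is a minimal sketch of a circulant layer in plain Python. It is an illustration only, not the paper's implementation: the function names, the first-column convention, and the toy values are assumptions chosen for this example. An n×n circulant matrix is fully determined by a single length-n vector, so the layer stores n parameters instead of n².

```python
# Illustrative sketch (not the authors' code): a circulant matrix is
# defined by one vector c, with entry W[i][j] = c[(i - j) mod n].

def circulant_dense(c):
    """Expand the defining vector c into the full n x n circulant matrix."""
    n = len(c)
    return [[c[(i - j) % n] for j in range(n)] for i in range(n)]

def dense_matvec(W, x):
    """Ordinary matrix-vector product using all n*n stored entries."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def circulant_matvec(c, x):
    """Same product computed from the n stored parameters only
    (a circular convolution of c with x)."""
    n = len(c)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]

# Toy values, assumed for the example.
c = [1.0, 2.0, 3.0, 4.0]
x = [1.0, 0.0, -1.0, 2.0]

W = circulant_dense(c)
y_dense = dense_matvec(W, x)
y_circ = circulant_matvec(c, x)

dense_params = len(c) ** 2  # parameters in an unstructured layer
circ_params = len(c)        # parameters in the circulant layer
```

Both products agree, while the circulant version keeps only the defining vector in memory. The loop above still costs O(n²) time; because the product is a circular convolution, it can also be evaluated with fast Fourier transforms in O(n log n), which is the usual reason circulant structure is attractive for large layers.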

Code Implementations: 1 repo
