LG, CV, ML | Jul 27, 2020

ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks

arXiv:2007.13384v1 | 2 citations
AI Analysis

This paper addresses the challenge of deploying deep learning efficiently in embedded applications, though it appears incremental because it builds on existing model compression methods.

The paper tackled the problem of deploying convolutional neural networks on resource-constrained embedded hardware by proposing ALF, an autoencoder-based low-rank filter-sharing technique, which achieved a 70% reduction in parameters, 61% in operations, and 41% in execution time with minimal accuracy loss.

Closing the gap between the hardware requirements of state-of-the-art convolutional neural networks and the limited resources constraining embedded applications is the next big challenge in deep learning research. The computational complexity and memory footprint of such neural networks are typically daunting for deployment in resource-constrained environments. Model compression techniques, such as pruning, are emphasized among other optimization methods for solving this problem. Most existing techniques require domain expertise or result in irregular sparse representations, which increase the burden of deploying deep learning applications on embedded hardware accelerators. In this paper, we propose the autoencoder-based low-rank filter-sharing technique (ALF). When applied to various networks, ALF is compared to state-of-the-art pruning methods, demonstrating its efficient compression capabilities on theoretical metrics as well as on an accurate, deterministic hardware model. In our experiments, ALF showed a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
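The abstract does not spell out how ALF works, so the following is only a rough sketch of why low-rank filter sharing in general can shrink a convolutional layer. It is not the authors' algorithm. It assumes a layer is replaced by a small set of shared k x k basis filters followed by a 1 x 1 expansion back to the original channel count. The function names, the layer sizes (64 channels, 3 x 3 kernels), and the rank of 16 are all hypothetical choices for illustration.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (biases ignored)."""
    return c_in * c_out * k * k


def low_rank_params(c_in: int, c_out: int, k: int, rank: int) -> int:
    """Weights when `rank` shared k x k basis filters are followed by
    a 1 x 1 convolution that expands back to c_out channels.

    This factorization is an assumed stand-in, not the ALF method itself.
    """
    basis = c_in * rank * k * k   # shared basis filters
    expansion = rank * c_out      # 1 x 1 expansion weights
    return basis + expansion


def reduction(c_in: int, c_out: int, k: int, rank: int) -> float:
    """Fraction of weights removed by the factorization above."""
    dense = conv_params(c_in, c_out, k)
    shared = low_rank_params(c_in, c_out, k, rank)
    return 1.0 - shared / dense


if __name__ == "__main__":
    # Hypothetical layer: 64 -> 64 channels, 3 x 3 kernels, rank 16.
    print(f"dense weights:  {conv_params(64, 64, 3)}")
    print(f"shared weights: {low_rank_params(64, 64, 3, 16)}")
    print(f"reduction:      {reduction(64, 64, 3, 16):.1%}")
```

With these made-up sizes the factorized layer keeps 10240 of 36864 weights, a reduction of roughly 72%. That this lands near the 70% reported in the abstract is a coincidence of the chosen numbers; the paper's figure comes from its own experiments.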

