LG · Jun 3, 2022

Canonical convolutional neural networks

arXiv:2206.01509v14 citations · h-index: 52
Originality: Synthesis-oriented
AI Analysis

This work addresses neural network efficiency and compression for machine learning practitioners. It is incremental in that it builds on existing normalization and tensor decomposition techniques.

The authors introduce canonical weight normalization for convolutional neural networks, which expresses weight tensors as scaled sums of outer vector products. The method performs competitively on the MNIST, CIFAR10, and SVHN datasets and allows convenient model compression by truncating the parameter sums.

We introduce canonical weight normalization for convolutional neural networks. Inspired by the canonical tensor decomposition, we express the weight tensors in so-called canonical networks as scaled sums of outer vector products. In particular, we train network weights in the decomposed form, where scale weights are optimized separately for each mode. Additionally, similarly to weight normalization, we include a global scaling parameter. We study the initialization of the canonical form by running the power method and by drawing randomly from Gaussian or uniform distributions. Our results indicate that we can replace the power method with cheaper initializations drawn from standard distributions. The canonical re-parametrization leads to competitive normalization performance on the MNIST, CIFAR10, and SVHN data sets. Moreover, the formulation simplifies network compression. Once training has converged, the canonical form allows convenient model-compression by truncating the parameter sums.
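To make the re-parametrization described in the abstract concrete, the following NumPy sketch builds a convolution kernel as a globally scaled sum of outer products and then compresses it by truncating the sum. It is an illustration under stated assumptions, not the authors' implementation: the function names `canonical_weight` and `truncate`, the normalization of each factor vector to unit length, the per-mode scale layout, and the rule of keeping the terms with the largest scale product are all hypothetical choices made for this example.

```python
from functools import reduce

import numpy as np


def canonical_weight(factors, mode_scales, g):
    """Assemble a weight tensor from a rank-R canonical (CP-style) form.

    factors:     one array per tensor mode, each of shape (dim_n, R)
    mode_scales: one array per tensor mode, each of shape (R,)
                 (assumed layout for the "separate scale per mode")
    g:           global scalar scale, analogous to weight normalization
    """
    rank = factors[0].shape[1]
    shape = tuple(f.shape[0] for f in factors)
    weight = np.zeros(shape)
    for r in range(rank):
        # Assumption: each factor vector is normalized to unit length.
        vectors = [f[:, r] / np.linalg.norm(f[:, r]) for f in factors]
        # Outer product across all modes gives one rank-1 tensor.
        term = reduce(np.multiply.outer, vectors)
        # The scale of the term is the product of its per-mode scales.
        scale = np.prod([s[r] for s in mode_scales])
        weight += scale * term
    return g * weight


def truncate(factors, mode_scales, keep):
    """Keep the `keep` terms with the largest absolute scale product.

    This selection rule is an assumption made for illustration.
    """
    strength = np.abs(np.prod(np.stack(mode_scales), axis=0))
    idx = np.argsort(strength)[::-1][:keep]
    return [f[:, idx] for f in factors], [s[idx] for s in mode_scales]


# Hypothetical kernel: 8 output channels, 4 input channels, 3x3 filter.
rng = np.random.default_rng(0)
kernel_shape = (8, 4, 3, 3)
rank = 5

# Random Gaussian initialization, one of the cheap options the abstract names.
factors = [rng.normal(size=(dim, rank)) for dim in kernel_shape]
mode_scales = [rng.normal(size=rank) for _ in kernel_shape]
g = 1.0

full_kernel = canonical_weight(factors, mode_scales, g)

# Compression: drop all but the two strongest terms of the sum.
small_factors, small_scales = truncate(factors, mode_scales, keep=2)
small_kernel = canonical_weight(small_factors, small_scales, g)

print(full_kernel.shape, small_kernel.shape)
```

Both kernels have the same shape, so the truncated one can replace the full one in a convolution. What shrinks is the number of stored parameters, which grows with the number of retained terms rather than with the product of the kernel dimensions.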

Code Implementations: 1 repo
