Point Cloud Network: An Order of Magnitude Improvement in Linear Layer Parameter Count
This work addresses parameter efficiency for deep learning practitioners by offering a more parameter-efficient alternative to MLPs, though the contribution appears incremental because it focuses on optimizing existing linear layers.
The paper tackles the problem of high parameter counts in the linear layers of deep learning networks by introducing the Point Cloud Network (PCN) architecture, which achieves a 99.5% reduction in linear-layer parameters while maintaining comparable test accuracy on the CIFAR-10 and CIFAR-100 datasets.
This paper introduces the Point Cloud Network (PCN) architecture, a novel implementation of linear layers in deep learning networks, and provides empirical evidence for preferring it over the Multilayer Perceptron (MLP) in linear layers. We train several models, including the original AlexNet (Krizhevsky et al., 2012), using both MLP and PCN architectures for a direct comparison of linear layers. The key results collected are model parameter count and top-1 test accuracy on the CIFAR-10 and CIFAR-100 datasets (Krizhevsky, 2009). AlexNet-PCN16, our PCN equivalent of AlexNet, achieves comparable efficacy (test accuracy) to the original architecture with a 99.5% reduction in the parameters of its linear layers. All training is done on cloud RTX 4090 GPUs, using PyTorch for model construction and training. Code is provided so that anyone can reproduce the trials from this paper.
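To give a sense of scale for the linear-layer parameter counts discussed above, the following is a minimal sketch of how a dense (MLP) layer's parameters can be counted and what a 99.5% reduction would leave. It does not implement PCN. The layer widths used (9216, 4096, 4096, 1000) are those of the standard ImageNet AlexNet classifier and are an assumption here; the CIFAR models in the paper may use different widths. The helper names are hypothetical.

```python
# Sketch only: counts parameters of fully connected (MLP) layers.
# Assumption: widths are those of the standard ImageNet AlexNet classifier,
# not necessarily the CIFAR configuration used in the paper.

def linear_param_count(in_features: int, out_features: int) -> int:
    """Weights (in * out) plus one bias per output unit."""
    return in_features * out_features + out_features


def mlp_param_count(widths: list[int]) -> int:
    """Total parameters of a stack of dense layers with the given widths."""
    return sum(
        linear_param_count(n_in, n_out)
        for n_in, n_out in zip(widths[:-1], widths[1:])
    )


def remaining_after_reduction(param_count: int, reduction: float) -> int:
    """Parameters left after removing the given fraction (e.g. 0.995)."""
    return round(param_count * (1.0 - reduction))


# Hypothetical example widths: flattened conv features -> 4096 -> 4096 -> classes.
ASSUMED_ALEXNET_WIDTHS = [9216, 4096, 4096, 1000]

dense_total = mlp_param_count(ASSUMED_ALEXNET_WIDTHS)
print(f"dense linear-layer parameters: {dense_total:,}")
print(f"after a 99.5% reduction:       {remaining_after_reduction(dense_total, 0.995):,}")
```

Under these assumed widths, the dense layers hold tens of millions of parameters, so a 99.5% reduction would leave a count in the hundreds of thousands. The exact figures for AlexNet-PCN16 are those reported in the paper, not the output of this sketch.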