LG · AI · CV · Feb 10, 2023

Element-Wise Attention Layers: an option for optimization

arXiv:2302.05488v1 · 2 citations · h-index: 3
Originality: Incremental advance
AI Analysis

The paper addresses the hardware constraints on deploying attention-based models by offering a parameter-efficient alternative, though it appears incremental because it adapts existing dot-product attention.

The paper tackles the high parameter count of attention layers by proposing an element-wise attention method. Relative to a VGG-like model, it reaches 92% of the baseline's accuracy with 97% fewer parameters on Fashion MNIST, and 60% of the baseline's accuracy with 50% fewer parameters on CIFAR10.

Attention layers have become widespread since the popularization of Transformer-based models and are the key element of many state-of-the-art models developed in recent years. However, one of the biggest obstacles to implementing these architectures, as with many others in deep learning, is their enormous number of trainable parameters, which makes their use conditional on the availability of powerful hardware. This paper proposes a new attention mechanism that adapts Dot-Product Attention, which uses matrix multiplications, to become element-wise through the use of array multiplications. To test the effectiveness of this approach, two models (one with a VGG-like architecture and one with the proposed method) were trained on a classification task using the Fashion MNIST and CIFAR10 datasets. Each model was trained for 10 epochs on a single Tesla T4 GPU from Google Colaboratory. The results show that the mechanism reaches 92% of the accuracy of the VGG-like counterpart on Fashion MNIST while reducing the number of parameters by 97%. On CIFAR10, it reaches 60% of the accuracy of the VGG-like counterpart while using 50% fewer parameters.
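To make the contrast between matrix and array multiplications concrete, the following is a minimal NumPy sketch, not the paper's implementation. The abstract does not give the exact formulation, so the function `elementwise_attention`, the per-feature weight shape `(d,)`, the softmax axis, and the sizes `n` and `d` are all assumptions chosen for illustration. It only shows why replacing the `d × d` projection matrices with arrays multiplied element-wise shrinks the parameter count.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_attention(x, wq, wk, wv):
    # Standard scaled dot-product self-attention.
    # x: (n, d) inputs; wq, wk, wv: (d, d) projection matrices.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = softmax(q @ k.T / np.sqrt(x.shape[-1]))  # (n, n)
    return scores @ v                                 # (n, d)

def elementwise_attention(x, wq, wk, wv):
    # Hypothetical element-wise variant (an assumption, not the paper's
    # exact layer): wq, wk, wv are (d,) arrays broadcast over the rows,
    # and every matrix product becomes a Hadamard product.
    q, k, v = x * wq, x * wk, x * wv
    scores = softmax(q * k / np.sqrt(x.shape[-1]), axis=-1)  # (n, d)
    return scores * v                                        # (n, d)

rng = np.random.default_rng(0)
n, d = 4, 8  # illustrative sizes only
x = rng.normal(size=(n, d))

dense_params = [rng.normal(size=(d, d)) for _ in range(3)]
elem_params = [rng.normal(size=(d,)) for _ in range(3)]

out_dense = dot_product_attention(x, *dense_params)
out_elem = elementwise_attention(x, *elem_params)

n_dense = sum(w.size for w in dense_params)  # 3 * d * d
n_elem = sum(w.size for w in elem_params)    # 3 * d
print(out_dense.shape, out_elem.shape, n_dense, n_elem)
```

Under these assumptions the dense layer holds `3 * d * d` weights and the element-wise one `3 * d`, so the gap grows linearly with the feature dimension. The reductions reported in the abstract (97% and 50%) refer to whole models and need not match this per-layer ratio.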

