LG · CV · Mar 1, 2023

Speeding Up EfficientNet: Selecting Update Blocks of Convolutional Neural Networks using Genetic Algorithm in Transfer Learning

arXiv:2303.00261v1 · 6 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses a challenge faced by practitioners who lack expert knowledge of CNN architectures: it automates the selection of trainable layers in transfer learning. The advance is incremental, since it builds on existing methods such as genetic algorithms and EfficientNet.

The paper tackles the problem of selecting which blocks of layers to update in transfer learning for convolutional neural networks, using a genetic algorithm to automate this selection. With EfficientNetB0 pre-trained on ImageNet, the method reaches accuracy similar to or better than the baseline on Food-101, CIFAR-100, and MangoLeafBD, and it reduces training and evaluation time because fewer parameters are learned.

The performance of convolutional neural networks (CNNs) depends heavily on their architectures. The transfer learning performance of a CNN relies strongly on the selection of its trainable layers. Selecting the most effective update layers for a given target dataset often requires expert knowledge of CNN architecture, which many practitioners do not possess. General users prefer to use an available architecture (e.g. GoogleNet, ResNet, EfficientNet) developed by domain experts. With the ever-growing number of layers, it is becoming increasingly difficult and cumbersome to handpick the update layers. Therefore, in this paper we explore the application of a genetic algorithm to mitigate this problem. The convolutional layers of popular pretrained networks are often grouped into modules that constitute their building blocks. We devise a genetic algorithm to select the blocks of layers whose parameters are updated. By experimenting with EfficientNetB0 pre-trained on ImageNet and using Food-101, CIFAR-100 and MangoLeafBD as target datasets, we show that our algorithm yields similar or better results than the baseline in terms of accuracy, and requires less training and evaluation time because it learns fewer parameters. We also devise a metric called block importance to measure the efficacy of each block as an update block, and we analyze the importance of the blocks selected by our algorithm.
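
The block-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each candidate solution is a binary mask with one bit per block (1 = update the block, 0 = keep it frozen) and uses a generic genetic algorithm with elitism, tournament selection, one-point crossover and bit-flip mutation. The names `toy_fitness`, `evolve`, `TOY_GAIN`, `TOY_COST`, the block count of 7 and all hyperparameters are hypothetical. In a real setup the fitness function would fine-tune the pretrained network with only the masked blocks trainable and return a validation score; here a cheap made-up function stands in for that expensive step.

```python
import random

NUM_BLOCKS = 7  # assumption: one gene per candidate update block

# Hypothetical per-block benefit and per-block training cost (arbitrary integer units).
# These numbers are invented for illustration and are not results from the paper.
TOY_GAIN = [10, 10, 15, 30, 50, 80, 100]
TOY_COST = 20


def toy_fitness(mask):
    """Stand-in for: fine-tune with these blocks trainable, return a validation score."""
    return sum(g for g, bit in zip(TOY_GAIN, mask) if bit) - TOY_COST * sum(mask)


def evolve(fitness, num_blocks=NUM_BLOCKS, pop_size=12, generations=30,
           mutation_rate=0.1, seed=0):
    """Search for a binary block mask that maximizes `fitness`."""
    rng = random.Random(seed)  # local generator keeps runs reproducible
    population = [[rng.randint(0, 1) for _ in range(num_blocks)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_pop = [ranked[0][:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: best of 3 randomly drawn individuals.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            # One-point crossover.
            cut = rng.randint(1, num_blocks - 1)
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        population = next_pop
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best_mask, best_history = evolve(toy_fitness)
print("selected update blocks:", [i for i, bit in enumerate(best_mask) if bit])
print("fitness of selected mask:", toy_fitness(best_mask))
```

Because the best mask of each generation is copied into the next one, the best fitness never decreases across generations. Swapping `toy_fitness` for a function that actually trains and evaluates the network is what makes the search expensive, which is why the population size and generation count would need to be chosen with the training budget in mind.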

