LG | AI | Sep 11, 2022

Patching Weak Convolutional Neural Network Models through Modularization and Composition

arXiv:2209.06116v3 | 14 citations | h-index: 22
Originality: Incremental advance
AI Analysis

CNNSplitter offers a modular way to improve the robustness of CNN models, particularly for classification tasks where specific classes underperform. The advance is incremental because it builds on existing software engineering concepts.

The paper tackles the problem of patching the weak parts of convolutional neural network (CNN) classification models without costly retraining. It proposes CNNSplitter, which decomposes a strong model into modules and composes them with weak models. Experimental results show average improvements of 12.54% in precision and 2.14% in recall for target classes, and a 1.18% accuracy gain for non-target classes.

Despite great success in many applications, deep neural networks are not always robust in practice. For instance, a convolutional neural network (CNN) model for classification tasks often performs unsatisfactorily on some particular classes of objects. In this work, we are concerned with patching the weak part of a CNN model instead of improving it through the costly retraining of the entire model. Inspired by the fundamental concepts of modularization and composition in software engineering, we propose a compressed modularization approach, CNNSplitter, which decomposes a strong CNN model for $N$-class classification into $N$ smaller CNN modules. Each module is a sub-model containing a part of the convolution kernels of the strong model. To patch a weak CNN model that performs unsatisfactorily on a target class (TC), we compose the weak CNN model with the corresponding module obtained from a strong CNN model. The ability of the weak CNN model to recognize the TC can thus be improved through patching. Moreover, the ability to recognize non-TCs is also improved, because samples previously misclassified as the TC can now be classified correctly as non-TCs. Experimental results with two representative CNNs on three widely used datasets show that the average improvements on the TC in precision and recall are 12.54% and 2.14%, respectively. Moreover, patching improves the accuracy on non-TCs by 1.18%. The results demonstrate that CNNSplitter can patch a weak CNN model through modularization and composition, thus providing a new solution for developing robust CNN models.
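The composition step described in the abstract can be illustrated with a toy sketch. The abstract does not say how the module's output is combined with the weak model's output, so this example assumes the simplest scheme: the weak model's score for the TC is replaced by the module's score, and the prediction is the argmax of the result. The function names `predict` and `patch_scores`, the score values, and the assumption that both scores are on a comparable scale are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def predict(scores: Sequence[float]) -> int:
    """Return the index of the highest score (argmax)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def patch_scores(
    weak_scores: Sequence[float],
    module_tc_score: float,
    target_class: int,
) -> List[float]:
    """Assumed composition rule: overwrite the weak model's score for the
    target class with the score produced by the strong model's module."""
    patched = list(weak_scores)
    patched[target_class] = module_tc_score
    return patched


TARGET_CLASS = 1  # hypothetical index of the target class (TC)

# Case 1: a true TC sample that the weak model assigns to class 0.
weak_a = [0.5, 0.3, 0.2]
patched_a = patch_scores(weak_a, module_tc_score=0.9, target_class=TARGET_CLASS)

# Case 2: a non-TC sample that the weak model wrongly assigns to the TC.
weak_b = [0.3, 0.6, 0.1]
patched_b = patch_scores(weak_b, module_tc_score=0.05, target_class=TARGET_CLASS)

print(predict(weak_a), predict(patched_a))  # 0 1
print(predict(weak_b), predict(patched_b))  # 1 0
```

Case 1 shows the TC being recovered after patching. Case 2 shows the effect on non-TCs noted in the abstract: once the module lowers an inflated TC score, the sample falls back to a non-TC class.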

Code Implementations: 1 repo
