Reducing the dilution: An analysis of the information sensitiveness of capsule network with a practical improvement method
This work addresses a specific bottleneck in capsule networks for visual tasks, offering an incremental improvement to enhance their competitiveness with CNNs.
The paper addresses the performance gap between capsule networks and CNNs on datasets with backgrounds and complex objects. It attributes the gap to a conflict between the information sensitiveness of capsules and the activation value distribution in the primary capsule layer, and proposes restraining those activation values. The method improves performance on various mainstream datasets and serves as a simple, efficient regularization technique for capsule networks.
Capsule networks have shown various advantages over convolutional neural networks (CNNs). They preserve more precise spatial information than CNNs, use equivariance instead of invariance during inference, and have high potential to become an effective new tool for visual tasks. However, current capsule networks are not competitive with CNNs on datasets with backgrounds and complex target objects, and they lack a universal and efficient regularization method. We identify a main reason for this uncompetitive performance: a conflict between the information sensitiveness of capsule networks and the unreasonably high activation value distribution of capsules in the primary capsule layer. Accordingly, we propose a practical improvement method that restrains the activation values of capsules in the primary capsule layer, suppressing non-informative capsules and highlighting discriminative ones. In experiments, the method achieves better performance on various mainstream datasets. In addition, the proposed method can be seen as a suitable, simple, and efficient regularization method that can be used generally in capsule networks.
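To make the idea of restraining primary-capsule activations concrete, the following is a minimal sketch, not the procedure used in the paper. It assumes the activation value of a capsule is the length of its output vector after the standard squash nonlinearity, and it restrains activations by dividing the pre-activation vector by a constant before squashing. The function name restrained_squash, the temperature parameter, and the toy capsule vectors are all hypothetical and chosen only for illustration.

```python
import math


def squash(s, eps=1e-9):
    """Standard capsule squash: keeps direction, maps length into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    # |v| = |s|^2 / (1 + |s|^2); eps avoids division by zero for s = 0.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def length(v):
    """Activation value of a capsule, taken here as its vector length."""
    return math.sqrt(sum(x * x for x in v))


def restrained_squash(s, temperature=4.0):
    """Hypothetical restraint: shrink the pre-activation, then squash.

    This is an illustrative assumption, not the paper's stated method.
    """
    return squash([x / temperature for x in s])


# Two toy primary capsules: one strong (assumed informative), one weak.
primary = [[3.0, 4.0], [0.3, 0.4]]

plain = [length(squash(s)) for s in primary]
restrained = [length(restrained_squash(s)) for s in primary]

# Ratio of strong to weak activation, before and after restraining.
plain_contrast = plain[0] / plain[1]
restrained_contrast = restrained[0] / restrained[1]

print("plain activations:     ", plain)
print("restrained activations:", restrained)
print("strong/weak contrast:  ", plain_contrast, "->", restrained_contrast)
```

In this toy setting every activation is lowered, but the weak capsule shrinks proportionally far more than the strong one, so the contrast between them grows. That is one plausible reading of "suppress non-informative capsules and highlight discriminative capsules"; the actual restraint described in the paper may take a different form.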