CV, AI | Apr 4, 2022

Capsule Networks Do Not Need to Model Everything

Riccardo Renzulli, Enzo Tartaglione, Marco Grangetto

arXiv:2204.01298v23.73 citationsh-index: 29

Originality: Incremental advance

AI Analysis

This work addresses computational inefficiencies in capsule networks for computer vision applications, but it is incremental as it builds on existing routing mechanisms.

The paper tackles the problem that capsule networks need large sizes to model every element of an image, which inflates parameter counts and computational cost. It introduces REM (Routing Entropy Minimization) so that the network focuses only on the objects of interest, reducing the number of parse trees and parameters with negligible performance loss.

Capsule networks are biologically inspired neural networks that group neurons into vectors called capsules, each explicitly representing an object or one of its parts. The routing mechanism connects capsules in consecutive layers, forming a hierarchical structure between parts and objects, also known as a parse tree. Capsule networks often attempt to model all elements in an image, requiring large network sizes to handle complexities such as intricate backgrounds or irrelevant objects. However, this comprehensive modeling leads to increased parameter counts and computational inefficiencies. Our goal is to enable capsule networks to focus only on the object of interest, reducing the number of parse trees. We accomplish this with REM (Routing Entropy Minimization), a technique that minimizes the entropy of the parse-tree-like structure. REM drives the distribution of the model parameters towards low-entropy configurations through a pruning mechanism, significantly reducing the number of intra-class parse trees generated. This enables capsules to learn more stable and succinct representations with fewer parameters and negligible performance loss.
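
To make "entropy of the parse-tree-like structure" concrete, the sketch below computes one plausible measure: the mean normalized Shannon entropy of routing coefficients. The abstract does not give the exact quantity REM uses, so the function name `routing_entropy`, the row-stochastic `coupling` matrix layout, and the normalization by log(number of parents) are all assumptions made for illustration, not the authors' definition.

```python
import math


def routing_entropy(coupling):
    """Mean normalized Shannon entropy of the rows of a coupling matrix.

    coupling[i][j] is the (assumed) routing coefficient from lower-level
    capsule i to higher-level capsule j; each row is expected to sum to 1.
    The result lies in [0, 1]: 0 when every capsule routes to a single
    parent, 1 when every capsule routes uniformly to all parents.
    """
    n_parents = len(coupling[0])
    if n_parents < 2:
        return 0.0  # a single parent leaves no routing uncertainty
    max_entropy = math.log(n_parents)
    total = 0.0
    for row in coupling:
        # Zero coefficients contribute nothing (limit of p * log p as p -> 0).
        h = -sum(p * math.log(p) for p in row if p > 0.0)
        total += h / max_entropy
    return total / len(coupling)


# Hypothetical coefficients: three child capsules, four parent capsules.
uniform = [[0.25, 0.25, 0.25, 0.25]] * 3
peaked = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

print(routing_entropy(uniform))  # close to 1: maximally ambiguous routing
print(routing_entropy(peaked))   # close to 0: each child picks one parent
```

Under this reading, a low value means each part is assigned to essentially one parent, which corresponds to fewer distinct parse trees. How the paper's pruning mechanism actually lowers this quantity is not specified in the abstract and is not modeled here.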

