CV · Oct 26, 2017

Dynamic Routing Between Capsules

arXiv:1710.09829v2 · 5104 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of robust object recognition in computer vision, particularly for overlapping objects, and represents a novel paradigm rather than an incremental improvement.

The paper tackles the problem of recognizing overlapping digits by introducing a capsule system with dynamic routing, achieving state-of-the-art performance on MNIST and outperforming convolutional nets in handling high overlap.

A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or an object part. We use the length of the activity vector to represent the probability that the entity exists and its orientation to represent the instantiation parameters. Active capsules at one level make predictions, via transformation matrices, for the instantiation parameters of higher-level capsules. When multiple predictions agree, a higher-level capsule becomes active. We show that a discriminatively trained, multi-layer capsule system achieves state-of-the-art performance on MNIST and is considerably better than a convolutional net at recognizing highly overlapping digits. To achieve these results we use an iterative routing-by-agreement mechanism: A lower-level capsule prefers to send its output to higher-level capsules whose activity vectors have a big scalar product with the prediction coming from the lower-level capsule.
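The routing-by-agreement idea in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes the commonly described formulation: a "squash" nonlinearity that maps a vector's length into [0, 1) so it can be read as a probability, coupling coefficients obtained by a softmax over routing logits, and three routing iterations. The names `squash`, `route`, `u_hat`, and `num_iters` are chosen here for illustration, and the toy prediction vectors are made up. The transformation matrices are skipped: the sketch starts from the prediction vectors they would produce.

```python
import math

def length(v):
    """Euclidean length of a vector; read here as an existence probability."""
    return math.sqrt(sum(x * x for x in v))

def squash(s):
    """Assumed nonlinearity: keep the direction of s, map its length into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    if sq_norm == 0.0:
        return [0.0 for _ in s]
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in s]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(b - m) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def route(u_hat, num_iters=3):
    """u_hat[i][j] is lower capsule i's prediction for higher capsule j.

    Returns (v, c): the higher-level output vectors and the coupling
    coefficients that were used to compute them in the last iteration.
    """
    n_lower, n_upper, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_upper for _ in range(n_lower)]  # routing logits start at zero
    for _ in range(num_iters):
        # Each lower capsule distributes its output across the higher capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_upper):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_lower))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Agreement step: raise the logit where prediction and output align
        # (large scalar product), as the abstract describes.
        for i in range(n_lower):
            for j in range(n_upper):
                b[i][j] += dot(u_hat[i][j], v[j])
    return v, c

# Toy input (hypothetical): three lower capsules, two higher capsules.
# Predictions for higher capsule 0 agree; those for capsule 1 conflict.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
v, c = route(u_hat)
print([round(length(vj), 3) for vj in v])
```

In this toy case the higher capsule whose incoming predictions agree ends up with the longer output vector, and every lower capsule shifts most of its coupling toward it.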

Code Implementations: 77 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
