LG · ML · Mar 18, 2019

Advanced Capsule Networks via Context Awareness

arXiv:1903.07497v3 · 18 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the weak performance of Capsule Networks on images with distinct contexts and offers a solution for applications such as sign language recognition, but it is incremental, building on existing CN architectures.

The authors tackle the poor performance of Capsule Networks on images with distinct contexts by adding pooling and reconstruction layers. On an ASL fingerspelling dataset, their improved CNs perform comparably to deep learning models while significantly reducing training time.

Capsule Networks (CN) offer new architectures for the Deep Learning (DL) community. Though their effectiveness has been demonstrated on the MNIST and smallNORB datasets, the networks still face challenges on other datasets whose images have distinct contexts. In this research, we improve the design of CN (Vector version): we add more Pooling layers to filter image backgrounds and more Reconstruction layers to improve image restoration. Additionally, we perform experiments to compare the accuracy and speed of CN against DL models. For the DL models, we use Inception V3 and DenseNet V201 for powerful computers, and NASNet, MobileNet V1, and MobileNet V2 for small and embedded devices. We evaluate our models on a fingerspelling alphabet dataset from American Sign Language (ASL). The results show that CNs perform comparably to DL models while dramatically reducing training time. We also provide a demonstration and a link to it for illustration.
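As a rough illustration of two building blocks the abstract mentions, the sketch below shows a 2x2 max-pooling step (the kind of operation a Pooling layer uses to downsample an image) and the squash nonlinearity from the original vector-capsule formulation. This is not the authors' implementation: the function names `max_pool_2x2` and `squash`, the window size, and the `eps` stabilizer are assumptions chosen for illustration, and the paper's actual layer configuration is not given here.

```python
import math


def max_pool_2x2(image):
    """Downsample a 2D list by taking the max of each non-overlapping 2x2 window.

    Hypothetical helper: a trailing odd row or column is dropped.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def squash(vector, eps=1e-9):
    """Vector-capsule nonlinearity: keep the direction, map the length into [0, 1).

    Computes (|v|^2 / (1 + |v|^2)) * (v / |v|); eps (an assumption) avoids
    division by zero for the all-zero vector.
    """
    sq_norm = sum(x * x for x in vector)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm + eps)
    return [scale * x for x in vector]


if __name__ == "__main__":
    toy_image = [
        [1, 2, 0, 0],
        [3, 4, 0, 1],
        [0, 0, 5, 6],
        [0, 0, 7, 8],
    ]
    print(max_pool_2x2(toy_image))  # each 2x2 window collapses to its maximum
    print(squash([3.0, 4.0]))       # same direction, length below 1
```

Pooling keeps only the strongest response in each window, which is one plausible reading of how extra Pooling layers could suppress low-activation background regions. The squashed vector's length can then be read as the probability that the entity a capsule represents is present.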

