CV, NE | Jun 21, 2016

Tagger: Deep Unsupervised Perceptual Grouping

arXiv:1606.06724v2 | 169 citations
Originality: Incremental advance
AI Analysis

This paper addresses efficient perceptual inference for multi-object inputs that need not be images, and is rated an incremental improvement over existing segmentation and classification methods.

The researchers tackled perceptual grouping in multi-object scenes by developing a framework that learns the grouping either without supervision or alongside a supervised task. On very cluttered multi-digit images, it classifies better than convolutional networks despite being fully connected, and it greatly improves on the semi-supervised result of a baseline Ladder network, which suggests that segmentation can also improve sample efficiency.

We present a framework for efficient perceptual inference that explicitly reasons about the segmentation of its inputs and features. Rather than being trained for any specific segmentation, our framework learns the grouping process in an unsupervised manner or alongside any supervised task. By enriching the representations of a neural network, we enable it to group the representations of different objects in an iterative manner. By allowing the system to amortize the iterative inference of the groupings, we achieve very fast convergence. In contrast to many other recently proposed methods for addressing multi-object scenes, our system does not assume the inputs to be images and can therefore directly handle other modalities. For multi-digit classification of very cluttered images that require texture segmentation, our method offers improved classification performance over convolutional networks despite being fully connected. Furthermore, we observe that our system greatly improves on the semi-supervised result of a baseline Ladder network on our dataset, indicating that segmentation can also improve sample efficiency.
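The iterative grouping idea in the abstract can be illustrated with a small sketch. This is not the paper's architecture: Tagger amortizes the iterations with a learned network, whereas the code below swaps in a hand-written Gaussian mixture update so that it stays self-contained. The function name `iterative_grouping` and the parameters `k`, `steps`, and `sigma` are hypothetical names chosen for illustration.

```python
import math


def iterative_grouping(x, k=2, steps=10, sigma=0.5):
    """Softly assign each input element to one of k groups.

    Alternates between per-element group probabilities (m) and per-group
    predictions (mu). Illustrative stand-in only: the update rules here are
    a plain Gaussian mixture, not a learned network.
    """
    n = len(x)
    lo, hi = min(x), max(x)
    # Assumption: deterministic, evenly spread initial group predictions.
    mu = [lo + (hi - lo) * (g + 0.5) / k for g in range(k)]
    m = [[1.0 / k] * k for _ in range(n)]

    for _ in range(steps):
        # Assignment step: softmax over groups of Gaussian log-likelihoods.
        for i in range(n):
            logits = [-((x[i] - mu[g]) ** 2) / (2 * sigma ** 2) for g in range(k)]
            top = max(logits)  # subtract the max for numerical stability
            weights = [math.exp(v - top) for v in logits]
            total = sum(weights)
            m[i] = [w / total for w in weights]

        # Prediction step: each group re-estimates the value it explains.
        for g in range(k):
            mass = sum(m[i][g] for i in range(n))
            if mass > 1e-12:
                mu[g] = sum(m[i][g] * x[i] for i in range(n)) / mass

    return m, mu


if __name__ == "__main__":
    values = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]
    assignments, predictions = iterative_grouping(values)
    for value, probs in zip(values, assignments):
        print(value, [round(p, 3) for p in probs])
    print("group predictions:", [round(p, 3) for p in predictions])
```

Each pass refines both the soft assignments and the group predictions, which is the general pattern the abstract describes. Replacing these hand-written updates with a trained network is what would let such a system converge in very few iterations.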

Code Implementations: 2 repos
