CV · LG · NE · ML · May 24, 2017

Dense Transformer Networks

arXiv:1705.08881v2 · 17 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific computer vision problem: it improves segmentation accuracy by learning patch shapes and sizes from data. It is incremental because it builds on existing encoder-decoder architectures.

The authors tackle the limitation of fixed patch sizes in deep learning for dense prediction. They propose dense transformer networks that learn patch shapes and sizes from data, and they report better performance than baseline methods on natural and biological image segmentation tasks.

The key idea of current deep learning methods for dense prediction is to apply a model on a regular patch centered on each pixel to make pixel-wise predictions. These methods are limited in the sense that the patches are determined by network architecture instead of learned from data. In this work, we propose the dense transformer networks, which can learn the shapes and sizes of patches from data. The dense transformer networks employ an encoder-decoder architecture, and a pair of dense transformer modules are inserted into each of the encoder and decoder paths. The novelty of this work is that we provide technical solutions for learning the shapes and sizes of patches from data and efficiently restoring the spatial correspondence required for dense prediction. The proposed dense transformer modules are differentiable, thus the entire network can be trained. We apply the proposed networks on natural and biological image segmentation tasks and show superior performance is achieved in comparison to baseline methods.
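The abstract has two technical requirements: the spatial transformation must be differentiable, and the decoder must restore the spatial correspondence that the encoder's transformation changed. The sketch below illustrates both under simplifying assumptions. It uses a plain affine transform with bilinear sampling as a stand-in, since the abstract does not say which transformation family the dense transformer modules use. The function names (`affine_grid`, `bilinear_sample`, `invert_affine`, `warp`) are hypothetical and are not taken from the authors' code.

```python
import math

def affine_grid(theta, height, width):
    """For each output pixel, compute its source location in [-1, 1] coordinates."""
    (a, b, tx), (c, d, ty) = theta
    grid = []
    for i in range(height):
        y = -1.0 + 2.0 * i / (height - 1)
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1)
            row.append((a * x + b * y + tx, c * x + d * y + ty))
        grid.append(row)
    return grid

def bilinear_sample(image, grid):
    """Sample the image at the grid locations; output is piecewise linear in the coordinates."""
    height, width = len(image), len(image[0])

    def pixel(r, c):
        # Zero padding outside the image bounds.
        return image[r][c] if 0 <= r < height and 0 <= c < width else 0.0

    out = []
    for grid_row in grid:
        out_row = []
        for xs, ys in grid_row:
            # Convert normalized coordinates back to pixel coordinates.
            px = (xs + 1.0) * (width - 1) / 2.0
            py = (ys + 1.0) * (height - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            wx, wy = px - x0, py - y0
            top = (1 - wx) * pixel(y0, x0) + wx * pixel(y0, x0 + 1)
            bottom = (1 - wx) * pixel(y0 + 1, x0) + wx * pixel(y0 + 1, x0 + 1)
            out_row.append((1 - wy) * top + wy * bottom)
        out.append(out_row)
    return out

def invert_affine(theta):
    """Invert a 2x3 affine transform [A | t] to [A^-1 | -A^-1 t]."""
    (a, b, tx), (c, d, ty) = theta
    det = a * d - b * c
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)], [ic, id_, -(ic * tx + id_ * ty)]]

def warp(image, theta):
    return bilinear_sample(image, affine_grid(theta, len(image), len(image[0])))

# Encoder-side warp followed by a decoder-side inverse warp.
image = [[float(5 * r + c) for c in range(5)] for r in range(3)]
shift = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.0]]  # one-pixel horizontal shift at width 5
warped = warp(image, shift)
restored = warp(warped, invert_affine(shift))
```

In this toy case `restored` matches `image` everywhere except the first column, whose content was pushed out of the frame by the forward warp and cannot be recovered. This shows why restoring spatial correspondence for dense prediction takes more than applying a second transformation.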

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
