CV, LG | Jan 24, 2025

Rethinking Encoder-Decoder Flow Through Shared Structures

Frederik Laboyrie, Mehmet Kerim Yucel, Albert Saa-Garriga

arXiv:2501.14535v1 | 3.6 | h-index: 10 | ICASSP

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck in dense prediction, the largely unchanged design of decoders, and offers computer vision researchers an incremental improvement over existing methods.

The paper tackles the problem of limited decoder innovation in dense prediction tasks by introducing shared structures called banks, which improve depth estimation performance for state-of-the-art transformer-based architectures on natural and synthetic images.

Dense prediction tasks have enjoyed a growing complexity of encoder architectures; decoders, however, have remained largely the same. They rely on individual blocks decoding intermediate feature maps sequentially. We introduce banks, shared structures that are used by each decoding block to provide additional context in the decoding process. Applied via resampling and feature fusion, these structures improve performance on depth estimation for state-of-the-art transformer-based architectures on natural and synthetic images whilst training on large-scale datasets.
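
The idea of a bank shared across decoding blocks can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how banks are built, resampled, or fused, so the sketch assumes single-channel feature maps stored as nested lists, nearest-neighbour resampling, and additive fusion. The names `resample_nearest`, `fuse_add`, and `decode_with_bank` are hypothetical.

```python
def resample_nearest(grid, out_h, out_w):
    """Nearest-neighbour resampling of a 2D grid (list of rows) to out_h x out_w."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse_add(features, context):
    """Element-wise additive fusion of two equally sized grids (an assumed fusion rule)."""
    return [[f + g for f, g in zip(f_row, c_row)]
            for f_row, c_row in zip(features, context)]


def decode_with_bank(encoder_maps, bank):
    """Decode coarse-to-fine; every stage reads the same shared bank.

    encoder_maps: intermediate feature maps ordered from coarsest to finest.
    bank: one shared 2D grid, resampled to each stage's resolution.
    """
    state = None
    for feat in encoder_maps:
        h, w = len(feat), len(feat[0])
        if state is not None:
            # Sequential path: upsample the previous stage's output and merge it in.
            feat = fuse_add(feat, resample_nearest(state, h, w))
        # Shared path: the same bank, resampled to this stage, adds extra context.
        state = fuse_add(feat, resample_nearest(bank, h, w))
    return state


# Toy input: three encoder maps (1x1, 2x2, 4x4) and a 2x2 bank.
encoder_maps = [
    [[1]],
    [[0, 0], [0, 0]],
    [[0] * 4 for _ in range(4)],
]
bank = [[1, 2], [3, 4]]
depth = decode_with_bank(encoder_maps, bank)
```

In this sketch the only difference from a plain sequential decoder is the second `fuse_add` call in each stage, which is where the shared structure enters. A real model would use learned, multi-channel tensors and learned fusion rather than fixed addition.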
