CV | AI | Jun 1, 2023

Dilated Convolution with Learnable Spacings: beyond bilinear interpolation

arXiv:2306.00817v2 | 5 citations | h-index: 36 | Has Code
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement aimed at computer vision researchers: it changes the interpolation used in dilated convolutions with learnable spacings, improving accuracy without adding parameters.

The paper improves Dilated Convolution with Learnable Spacings (DCLS) by extending its interpolation beyond the bilinear scheme. It shows that longer-range interpolations, Gaussian interpolation in particular, raise ImageNet1k classification accuracy on the ConvNeXt and ConvFormer architectures without adding parameters.

Dilated Convolution with Learnable Spacings (DCLS) is a recently proposed variation of the dilated convolution in which the spacings between the non-zero elements in the kernel, or equivalently their positions, are learnable. Non-integer positions are handled via interpolation. Thanks to this trick, positions have well-defined gradients. The original DCLS used bilinear interpolation, and thus only considered the four nearest pixels. Yet here we show that longer range interpolations, and in particular a Gaussian interpolation, allow improving performance on ImageNet1k classification on two state-of-the-art convolutional architectures (ConvNeXt and ConvFormer), without increasing the number of parameters. The method code is based on PyTorch and is available at https://github.com/K-H-Ismail/Dilated-Convolution-with-Learnable-Spacings-PyTorch
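
To make the contrast between the two interpolations concrete, here is a minimal sketch in plain Python, not the authors' PyTorch implementation. It places a single kernel weight at a non-integer position on a dense grid, once with bilinear interpolation and once with a Gaussian. The function names, the 7x7 grid, the fixed `sigma`, and the normalization of the Gaussian so that it sums to one are all assumptions made for illustration; the repository linked above is the reference for how the method actually handles these details.

```python
import math

def bilinear_kernel(weight, px, py, size):
    """Spread one weight at real position (px, py) over its 4 nearest grid cells."""
    kernel = [[0.0] * size for _ in range(size)]
    x0, y0 = math.floor(px), math.floor(py)
    fx, fy = px - x0, py - y0  # fractional offsets inside the cell
    for dy, wy in ((0, 1.0 - fy), (1, fy)):
        for dx, wx in ((0, 1.0 - fx), (1, fx)):
            x, y = x0 + dx, y0 + dy
            if 0 <= x < size and 0 <= y < size:
                kernel[y][x] += weight * wx * wy
    return kernel

def gaussian_kernel(weight, px, py, size, sigma=1.0):
    """Spread one weight over the whole grid with a Gaussian centred at (px, py).

    Assumption: the Gaussian is normalized to sum to 1, and sigma is a fixed,
    hypothetical value chosen only for this illustration.
    """
    raw = [[math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * sigma ** 2))
            for x in range(size)] for y in range(size)]
    total = sum(map(sum, raw))
    return [[weight * v / total for v in row] for row in raw]

def count_nonzero(kernel, eps=1e-12):
    """Number of grid cells that receive a non-negligible share of the weight."""
    return sum(1 for row in kernel for v in row if abs(v) > eps)

# One kernel element with weight 1.0 at the non-integer position (2.3, 3.6).
bil = bilinear_kernel(1.0, 2.3, 3.6, size=7)
gau = gaussian_kernel(1.0, 2.3, 3.6, size=7, sigma=1.0)

print(count_nonzero(bil))  # 4: only the four nearest pixels
print(count_nonzero(gau))  # 49: every cell of the 7x7 grid
```

In both cases the dense kernel depends smoothly on `(px, py)`, which is what lets positions receive gradients. The difference is the reach: the bilinear version touches four cells, while the Gaussian version lets the position influence, and be influenced by, pixels further away.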

Code Implementations: 1 repo
