CV | Apr 7, 2022

Total Variation Optimization Layers for Computer Vision

Raymond A. Yeh, Yuan-Ting Hu, Zhongzheng Ren, Alexander G. Schwing

arXiv:2204.03643v110.118 citations | h-index: 67 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of integrating efficient optimization layers into deep networks for computer vision, offering incremental improvements across multiple tasks.

The authors propose total variation (TV) minimization as an optimization layer for deep networks in computer vision. They report improvements over existing baselines on five tasks, including image classification and image denoising, and introduce a GPU-based projected-Newton method that is 37 times faster than existing solutions.

Optimization within a layer of a deep-net has emerged as a new direction for deep-net layer design. However, there are two main challenges when applying these layers to computer vision tasks: (a) which optimization problem within a layer is useful? (b) how to ensure that computation within a layer remains efficient? To study question (a), in this work, we propose total variation (TV) minimization as a layer for computer vision. Motivated by the success of total variation in image processing, we hypothesize that TV as a layer provides useful inductive bias for deep-nets too. We study this hypothesis on five computer vision tasks: image classification, weakly supervised object localization, edge-preserving smoothing, edge detection, and image denoising, improving over existing baselines. To achieve these results, we had to address question (b): we developed a GPU-based projected-Newton method which is $37\times$ faster than existing solutions.
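
To make the idea of "TV minimization as a layer" concrete, here is a minimal NumPy sketch of the forward pass only. It assumes the layer solves the standard 1D TV proximity problem $\min_y \frac{1}{2}\|y - x\|_2^2 + \lambda \|Dy\|_1$, where $D$ takes differences of neighboring entries; the abstract does not spell out the exact formulation, so this is an assumption. The solver is plain projected gradient on the dual problem, chosen for brevity. It is not the GPU-based projected-Newton method from the paper, and the names `tv_prox_1d`, `lam`, and `num_iters` are hypothetical.

```python
import numpy as np


def _diff_transpose(u):
    """Apply D^T, where D is the forward-difference operator (D y = y[1:] - y[:-1])."""
    return np.concatenate(([-u[0]], u[:-1] - u[1:], [u[-1]]))


def tv_prox_1d(x, lam, num_iters=500):
    """Approximately solve min_y 0.5 * ||y - x||^2 + lam * ||D y||_1 for a 1D signal.

    Illustrative solver (an assumption, not the paper's method): projected
    gradient descent on the dual problem
        min_u 0.5 * ||x - D^T u||^2  subject to  |u_i| <= lam,
    with the primal solution recovered as y = x - D^T u.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 2 or lam <= 0:
        return x.copy()
    u = np.zeros(x.size - 1)
    step = 0.25  # safe step size: the largest eigenvalue of D D^T is below 4
    for _ in range(num_iters):
        y = x - _diff_transpose(u)
        # Gradient step on the dual, then projection onto the box [-lam, lam].
        u = np.clip(u + step * np.diff(y), -lam, lam)
    return x - _diff_transpose(u)


if __name__ == "__main__":
    signal = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
    smoothed = tv_prox_1d(signal, lam=0.3)
    print("input TV: ", np.abs(np.diff(signal)).sum())
    print("output TV:", np.abs(np.diff(smoothed)).sum())
```

In this sketch `lam` controls the strength of the smoothing: `lam = 0` returns the input unchanged, and a sufficiently large `lam` flattens the signal to its mean while intermediate values shrink jumps but keep them sharp. Using this as a trainable layer would also require gradients with respect to `x` and `lam`, which this forward-only sketch does not provide.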
