CV · Mar 1, 2020

Learning When and Where to Zoom with Deep Reinforcement Learning

arXiv:2003.00425v2 · 80 citations
AI Analysis

The paper addresses cost efficiency in applications such as remote sensing, where high-resolution data is expensive to acquire and process. It appears incremental, however, since it builds on existing reinforcement learning techniques for selective data usage.

The paper tackles the computational and acquisition costs of high-resolution images by proposing PatchDrop, a reinforcement learning method that decides when and where to use high-resolution data based on low-resolution inputs. The results show that PatchDrop maintains similar accuracy while using significantly less high-resolution data on CIFAR10, CIFAR100, ImageNet, and fMoW.

While high resolution images contain semantically more useful information than their lower resolution counterparts, processing them is computationally more expensive, and in some applications, e.g. remote sensing, they can be much more expensive to acquire. For these reasons, it is desirable to develop an automatic method to selectively use high resolution data when necessary while maintaining accuracy and reducing acquisition/run-time cost. In this direction, we propose PatchDrop, a reinforcement learning approach to dynamically identify when and where to use/acquire high resolution data conditioned on the paired, cheap, low resolution images. We conduct experiments on the CIFAR10, CIFAR100, ImageNet and fMoW datasets, where we use significantly less high resolution data while maintaining similar accuracy to models which use full high resolution images.
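
As a rough illustration of what "where to use high resolution data" could mean in practice, the sketch below keeps only selected patches of a high-resolution image and reports the fraction of high-resolution data used. It is not the paper's implementation: the grid layout, the binary mask, and the function names `apply_patch_mask` and `hr_fraction` are all assumptions made for this example, and the mask is hard-coded rather than produced by a learned policy.

```python
def apply_patch_mask(image, mask, patch_size):
    """Zero out every patch of a 2D image whose mask entry is 0.

    Assumption for this sketch: the image is a list of equal-length rows
    and the mask has one entry per patch_size x patch_size patch.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Copy the pixel only if its patch was selected.
            if mask[r // patch_size][c // patch_size]:
                out[r][c] = image[r][c]
    return out


def hr_fraction(mask):
    """Fraction of patches kept at high resolution."""
    cells = [m for row in mask for m in row]
    return sum(cells) / len(cells)


# Toy 4x4 "high-resolution" image split into a 2x2 grid of 2x2 patches.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

# Hypothetical selection: keep the top-left and bottom-right patches.
mask = [[1, 0],
        [0, 1]]

masked = apply_patch_mask(image, mask, patch_size=2)
used = hr_fraction(mask)
```

In a learned setting, the mask would come from a policy conditioned on the low-resolution image, and the fraction of patches kept would serve as a proxy for acquisition and run-time cost.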

Code Implementations (2 repos)
