Irrelevant Pixels are Everywhere: Find and Exclude Them for More Efficient Computer Vision
This work addresses the high computational cost of computer vision on power-constrained systems such as mobile and IoT devices, and reports a significant efficiency improvement.
The paper tackles the inefficiency of CNNs by observing that many image pixels are irrelevant to a given task (for example, sky pixels when the task is looking for cars). Across three popular computer vision datasets, it finds that 48% of pixels are irrelevant. It proposes a method to exclude these pixels. On an embedded device, this causes no loss in accuracy while reducing inference latency, energy consumption, and multiply-add count by about 45%.
Computer vision is often performed using Convolutional Neural Networks (CNNs). CNNs are compute-intensive and challenging to deploy on power-constrained systems such as mobile and Internet-of-Things (IoT) devices. They are compute-intensive because they indiscriminately compute many features on all pixels of the input image. We observe that, given a computer vision task, images often contain pixels that are irrelevant to the task. For example, if the task is looking for cars, pixels in the sky are not very useful. Therefore, we propose that a CNN be modified to operate only on relevant pixels to save computation and energy. We propose a method to study three popular computer vision datasets, finding that 48% of pixels are irrelevant. We also propose the focused convolution, which modifies a CNN's convolutional layers to reject the pixels that are marked irrelevant. On an embedded device, we observe no loss in accuracy, while inference latency, energy consumption, and multiply-add count are all reduced by about 45%.
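The idea of skipping irrelevant pixels can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name `focused_conv2d`, the hand-made relevance mask, and the choice to fill skipped outputs with zeros are all assumptions made for illustration. The sketch only shows that a convolution restricted to marked pixels matches the dense result at those pixels while performing proportionally fewer multiply-adds.

```python
import numpy as np

def focused_conv2d(image, kernel, relevant):
    """Valid-mode 2D cross-correlation evaluated only where `relevant` is True.

    Hypothetical helper, not the paper's API.
    image:    (H, W) array
    kernel:   (kh, kw) array
    relevant: boolean array of shape (H - kh + 1, W - kw + 1)
    Returns (output, multiply_add_count). Skipped outputs stay at zero
    (an assumption of this sketch).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    if relevant.shape != (out_h, out_w):
        raise ValueError("mask must match the output shape")

    out = np.zeros((out_h, out_w), dtype=np.float64)
    rows, cols = np.nonzero(relevant)
    for r, c in zip(rows, cols):
        # One kh*kw window of multiply-adds per relevant output pixel.
        out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out, len(rows) * kh * kw

rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernel = rng.random((3, 3))

# Dense baseline: every output pixel is marked relevant.
all_relevant = np.ones((6, 6), dtype=bool)
dense_out, dense_macs = focused_conv2d(image, kernel, all_relevant)

# Toy mask: treat the top half (the "sky") as irrelevant.
mask = np.zeros((6, 6), dtype=bool)
mask[3:, :] = True
focused_out, focused_macs = focused_conv2d(image, kernel, mask)

print("dense multiply-adds:  ", dense_macs)
print("focused multiply-adds:", focused_macs)
print("fraction saved:       ", 1 - focused_macs / dense_macs)
```

In this toy setup half of the output pixels are masked out, so half of the multiply-adds are skipped. A practical version would also need a way to produce the mask and to carry it through successive layers, neither of which this sketch attempts.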