Revisiting Binary Local Image Description for Resource Limited Devices
This work addresses efficient local image description on resource-limited devices, offering an incremental improvement in the trade-off between accuracy and resource use.
The paper tackles the challenge of designing efficient computer vision algorithms for resource-limited devices by introducing two new binary image descriptors, BAD and HashSIFT, which establish new operating points in the trade-off between accuracy and computational resources. BAD has the fastest descriptor implementation reported in the literature, while HashSIFT approaches the accuracy of the top deep learning-based descriptors at a lower computational cost.
The advent of a panoply of resource-limited devices opens up new challenges in the design of computer vision algorithms, which must strike a clear compromise between accuracy and computational requirements. In this paper we present new binary image descriptors that emerge from applying triplet ranking loss, hard negative mining and anchor swapping to traditional features based on pixel differences and image gradients. These descriptors, BAD (Box Average Difference) and HashSIFT, establish new operating points on the state-of-the-art accuracy vs. resources trade-off curve. In our experiments we evaluate the accuracy, execution time and energy consumption of the proposed descriptors. We show that BAD has the fastest descriptor implementation in the literature, while HashSIFT approaches the accuracy of the top deep learning-based descriptors while being computationally more efficient. We have made the source code public.
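
To make the training ingredients named in the abstract concrete, the following is a minimal sketch of a triplet ranking loss combined with anchor swapping and in-batch hard negative mining. It is an illustration under stated assumptions, not the authors' implementation: the Euclidean distance, the margin value, the in-batch mining strategy, and all function names (`triplet_loss_with_swap`, `hardest_negative`, `batch_loss`) are hypothetical choices, since the abstract does not specify them.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss_with_swap(anchor, positive, negative, margin=1.0):
    """Triplet ranking loss with anchor swapping.

    Anchor swapping uses the smaller of d(anchor, negative) and
    d(positive, negative) as the negative distance, so the loss
    always penalizes the harder of the two configurations.
    """
    d_ap = l2(anchor, positive)
    d_an = l2(anchor, negative)
    d_pn = l2(positive, negative)
    d_neg = min(d_an, d_pn)  # anchor swap
    return max(0.0, margin + d_ap - d_neg)


def hardest_negative(index, anchors, positives):
    """Hard negative mining inside a batch (assumed strategy).

    For the matching pair at `index`, return the index j != index of the
    positive from another pair that lies closest to the anchor.
    """
    best_dist, best_j = None, None
    for j, candidate in enumerate(positives):
        if j == index:
            continue
        d = l2(anchors[index], candidate)
        if best_dist is None or d < best_dist:
            best_dist, best_j = d, j
    return best_j


def batch_loss(anchors, positives, margin=1.0):
    """Mean triplet loss over a batch of matching (anchor, positive) pairs."""
    total = 0.0
    for i in range(len(anchors)):
        j = hardest_negative(i, anchors, positives)
        total += triplet_loss_with_swap(anchors[i], positives[i], positives[j], margin)
    return total / len(anchors)
```

In this sketch each batch holds matching descriptor pairs only; the negatives are mined from the other pairs in the batch, which avoids sampling them separately. With `anchors = [[0.0, 0.0], [5.0, 0.0], [0.0, 9.0]]` and `positives = [[0.0, 1.0], [5.0, 1.0], [1.0, 9.0]]`, the pairs are already well separated, so a small margin gives zero loss and only a larger margin produces a training signal.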