CV, LG | Jul 12, 2023

YOGA: Deep Object Detection in the Wild with Lightweight Feature Learning and Multiscale Attention

arXiv:2307.05945v1 | 15 citations | h-index: 25
AI Analysis

This work addresses the need for deployable object detection on low-end edge devices, offering a flexible and efficient solution with incremental improvements in model design.

The paper tackles the problem of efficient object detection for edge devices by introducing YOGA, a lightweight model that achieves up to a 22% increase in AP and reduces parameters and FLOPs by 23-34% compared to state-of-the-art detectors.

We introduce YOGA, a deep learning based yet lightweight object detection model that can operate on low-end edge devices while still achieving competitive accuracy. The YOGA architecture consists of a two-phase feature learning pipeline with a cheap linear transformation, which learns feature maps using only half of the convolution filters required by conventional convolutional neural networks. In addition, it performs multi-scale feature fusion in its neck using an attention mechanism instead of the naive concatenation used by conventional detectors. YOGA is a flexible model that can be easily scaled up or down by several orders of magnitude to fit a broad range of hardware constraints. We evaluate YOGA on the COCO-val and COCO-testdev datasets against more than 10 other state-of-the-art object detectors. The results show that YOGA strikes the best trade-off between model size and accuracy (up to a 22% increase in AP and a 23-34% reduction in parameters and FLOPs), making it an ideal choice for deployment in the wild on low-end edge devices. This is further affirmed by our hardware implementation and evaluation on NVIDIA Jetson Nano.
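To make the "half of the convolution filters" claim concrete, the following is a rough parameter count for a single layer, not the authors' implementation. It assumes the cheap linear transformation is a depthwise convolution with a small kernel of size d applied to each map produced in the first phase; the abstract does not specify this, and the function names and layer sizes are hypothetical.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def two_phase_params(c_in: int, c_out: int, k: int, d: int = 3) -> int:
    """Weights when only half of the output maps come from regular filters.

    Phase 1: c_out // 2 ordinary k x k filters.
    Phase 2 (assumption): one d x d depthwise filter per phase-1 map
    generates the remaining half of the output maps.
    """
    if c_out % 2:
        raise ValueError("c_out must be even")
    half = c_out // 2
    primary = conv_params(c_in, half, k)  # regular convolution, half the filters
    cheap = half * d * d                  # depthwise "cheap" transformation
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 128, 256, 3
    full = conv_params(c_in, c_out, k)
    lite = two_phase_params(c_in, c_out, k)
    print(f"standard conv : {full} weights")
    print(f"two-phase     : {lite} weights")
    print(f"reduction     : {1 - lite / full:.1%}")
```

Under these assumptions the cheap phase adds very few weights, so the per-layer count falls to roughly half. The whole-model reductions quoted in the abstract (23-34%) are measured against other detectors and are not derived from this calculation.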

