CV · Sep 11, 2023

An Effective Two-stage Training Paradigm Detector for Small Dataset

Zheng Wang, Dong Xie, Hanzhi Wang, Jiang Tian

arXiv:2309.05652v1 · 1.51 citations · h-index: 2

Originality: Synthesis-oriented

AI Analysis

The work addresses the problem of limited labeled data for object detection, but it is incremental: it builds on existing methods such as YOLOv8 and masked image modeling.

The paper tackles object detection on small datasets by proposing a two-stage training paradigm for YOLOv8, achieving 30.4% average precision on the DelftBikes test set and ranking 4th in a challenge.

Learning from a limited amount of labeled data to pre-train a model has always been viewed as a challenging task. In this report, an effective and robust solution, the two-stage training paradigm YOLOv8 detector (TP-YOLOv8), is designed for the object detection track of the VIPriors Challenge 2023. First, the backbone of YOLOv8 is pre-trained as the encoder using masked image modeling. Then the detector is fine-tuned with elaborate augmentations. During the test stage, test-time augmentation (TTA) is used to enhance each model, and weighted box fusion (WBF) is applied to further boost performance. With this design, our approach achieves 30.4% average precision over IoU thresholds from 0.50 to 0.95 on the DelftBikes test set, ranking 4th on the leaderboard.
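The test-stage step described in the abstract (TTA followed by box fusion) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which TTA transforms were used, so a horizontal flip is assumed here, and the fusion below is a simplified single-class variant of weighted box fusion that leaves out the per-model confidence rescaling of the full algorithm. The names `iou`, `unflip_horizontal`, `merge_cluster`, and `fuse_boxes`, and the 0.55 IoU threshold, are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Det = Tuple[Box, float]                  # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def unflip_horizontal(box: Box, image_width: float) -> Box:
    """Map a box predicted on a horizontally flipped image back to the original frame."""
    x1, y1, x2, y2 = box
    return (image_width - x2, y1, image_width - x1, y2)


def merge_cluster(members: List[Det]) -> Det:
    """Score-weighted mean of the member boxes; score is the mean member score."""
    total = sum(score for _, score in members)
    box = tuple(sum(b[k] * s for b, s in members) / total for k in range(4))
    return box, total / len(members)


def fuse_boxes(detections: List[Det], iou_thr: float = 0.55) -> List[Det]:
    """Simplified single-class box fusion (assumed variant, see lead-in).

    Detections are visited in descending score order. Each one joins the first
    cluster whose current fused box overlaps it by at least iou_thr; otherwise
    it starts a new cluster.
    """
    clusters: List[List[Det]] = []
    fused: List[Det] = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        for i, (fused_box, _) in enumerate(fused):
            if iou(fused_box, box) >= iou_thr:
                clusters[i].append((box, score))
                fused[i] = merge_cluster(clusters[i])
                break
        else:
            clusters.append([(box, score)])
            fused.append((box, score))
    return fused
```

In use, boxes predicted on the flipped copy of an image would first pass through `unflip_horizontal`, then be pooled with the boxes from the original image (and from any other models) and handed to `fuse_boxes`. Unlike non-maximum suppression, which discards overlapping boxes, this kind of fusion averages them, so every overlapping prediction contributes to the final coordinates.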
