CV · AI · LG · Feb 28, 2022

Background Mixup Data Augmentation for Hand and Object-in-Contact Detection

arXiv:2202.13941v2 · 9 citations
Originality: Incremental advance
AI Analysis

This paper addresses data bias in hand-object detection for video activity understanding, but it is an incremental improvement over existing Mixup methods.

The paper tackles the unintended biases that Mixup data augmentation introduces in hand-object detection. It proposes Background Mixup, which mixes each target image with background images that contain no hands or objects-in-contact, reducing false positives and improving performance in both supervised and semi-supervised settings.

Detecting the positions of human hands and objects-in-contact (hand-object detection) in each video frame is vital for understanding human activities from videos. For training an object detector, a method called Mixup, which overlays two training images to mitigate data bias, has been empirically shown to be effective for data augmentation. However, in hand-object detection, mixing two hand-manipulation images produces unintended biases, e.g., the concentration of hands and objects in a specific region degrades the ability of the hand-object detector to identify object boundaries. We propose a data-augmentation method called Background Mixup that leverages data-mixing regularization while reducing the unintended effects in hand-object detection. Instead of mixing two images where a hand and an object in contact appear, we mix a target training image with background images without hands and objects-in-contact extracted from external image sources, and use the mixed images for training the detector. Our experiments demonstrated that the proposed method can effectively reduce false positives and improve the performance of hand-object detection in both supervised and semi-supervised learning settings.
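The mixing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `background_mixup`, the fixed blending weight `alpha`, and the `(x1, y1, x2, y2)` box format are assumptions made for illustration, and the abstract does not state how the mixing ratio is chosen. The sketch shows only the key idea: because the background image contains no hands or objects-in-contact, the target image's annotations can be reused unchanged after blending.

```python
import numpy as np


def background_mixup(target, background, boxes, alpha=0.5):
    """Blend a target image with a hand-free background image.

    target, background: float arrays of identical shape (H, W, C).
    boxes: annotations of the target image, e.g. (x1, y1, x2, y2) tuples
           (hypothetical format chosen for this sketch).
    alpha: weight given to the target image; an assumed hyperparameter.
    """
    if target.shape != background.shape:
        raise ValueError("resize the background to the target shape first")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Pixel-wise convex combination, as in standard Mixup.
    mixed = alpha * target + (1.0 - alpha) * background

    # The background adds no hands or objects-in-contact, so the
    # target's boxes remain the complete label set for the mixed image.
    return mixed, list(boxes)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((480, 640, 3))
    background = rng.random((480, 640, 3))
    mixed, labels = background_mixup(target, background, [(10, 20, 110, 220)], alpha=0.6)
    print(mixed.shape, labels)
```

By contrast, standard Mixup would blend two annotated hand-manipulation images and merge both label sets, which is the source of the hand and object crowding that the abstract identifies as an unintended bias.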

