CV · Apr 4, 2024

Multi Positive Contrastive Learning with Pose-Consistent Generated Images

arXiv:2404.03256v1 · 1 citation · h-index: 7
Originality: Highly original
AI Analysis

This work addresses a bottleneck in practical computer vision tasks such as human pose estimation, offering a more data-efficient pre-training approach.

The paper tackles the problem of pre-training for human pose estimation by generating visually distinct images with identical poses and using multi-positive contrastive learning, achieving superior performance in human-centric perception tasks with less than 1% of the data compared to state-of-the-art methods.

Model pre-training has become essential in various recognition tasks. Meanwhile, with the remarkable advances in image generation models, pre-training methods that use generated images have also emerged, given their ability to produce unlimited training data. However, while existing methods that use generated images excel in classification, they fall short in more practical tasks such as human pose estimation. In this paper, we demonstrate this shortcoming experimentally and propose generating visually distinct images with identical human poses. We then propose a novel multi-positive contrastive learning method that optimally utilizes these generated images to learn structural features of the human body. We term the entire learning pipeline GenPoCCL. Despite using less than 1% of the data used by the current state-of-the-art method, GenPoCCL captures structural features of the human body more effectively, surpassing existing methods in a variety of human-centric perception tasks.
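To make the idea of multi-positive contrastive learning concrete, here is a minimal sketch in plain Python. It is a generic formulation, not necessarily the exact loss used in GenPoCCL: each anchor treats every other sample with the same pose as a positive, and the loss is the cross-entropy between a uniform target over those positives and a softmax over similarities. The function name, the `pose_ids` grouping, and the temperature value are assumptions made for illustration.

```python
import math


def multi_positive_contrastive_loss(embeddings, pose_ids, temperature=0.1):
    """Generic multi-positive contrastive loss (illustrative sketch only).

    embeddings: list of equal-length float vectors, one per image.
    pose_ids:   list of labels; images sharing a label share a pose.
    """
    # L2-normalize so the dot product becomes cosine similarity.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, counted = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if pose_ids[j] == pose_ids[i]]
        if not positives:
            continue  # an anchor with no same-pose partner contributes nothing

        # Temperature-scaled similarities to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in others
        }
        # Numerically stable log of the softmax normalizer.
        peak = max(logits.values())
        log_z = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))

        # Cross-entropy against a uniform target over the positives.
        total += -sum(logits[j] - log_z for j in positives) / len(positives)
        counted += 1

    return total / counted if counted else 0.0


# Toy data: samples 0 and 1 point one way, samples 2 and 3 another.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = multi_positive_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mismatched = multi_positive_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy setup the loss is lower when the pose labels agree with the embedding geometry (`loss_matching`) than when they do not (`loss_mismatched`), which is the pressure that pulls same-pose images together in feature space regardless of their appearance.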

