Mining Multi-Label Samples from Single Positive Labels
The work lowers the annotation cost barrier for multi-label datasets in real-world scenarios, offering an incremental improvement by adapting existing GANs.
The paper tackles multi-label data generation when only single positive labels are available. It proposes a sampling method that lets existing GANs produce high-quality multi-label data at minimal annotation cost, with results comparable to models trained on fully annotated datasets.
Conditional generative adversarial networks (cGANs) have shown superior results in class-conditional generation tasks. To simultaneously control multiple conditions, cGANs require multi-label training datasets, where multiple labels can be assigned to each data instance. Nevertheless, the tremendous annotation cost limits the accessibility of multi-label datasets in real-world scenarios. Therefore, in this study we explore the practical setting called the single positive setting, where each data instance is annotated by only one positive label with no explicit negative labels. To generate multi-label data in the single positive setting, we propose a novel sampling approach called single-to-multi-label (S2M) sampling, based on the Markov chain Monte Carlo method. As a widely applicable "add-on" method, our proposed S2M sampling method enables existing unconditional and conditional GANs to draw high-quality multi-label data with a minimal annotation cost. Extensive experiments on real image datasets verify the effectiveness and correctness of our method, even when compared to a model trained with fully annotated datasets.
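To make the idea of an MCMC-based "add-on" sampler concrete, the following is a minimal sketch of a Metropolis-Hastings independence sampler that reweights draws from a fixed generator. It is not the S2M algorithm itself, whose details are not given in the abstract. Everything in it is an assumption made for illustration: `toy_generator` stands in for a pretrained GAN, and `toy_weight` stands in for an unnormalized density ratio between the desired multi-label distribution and the generator's distribution, which in practice would have to be estimated (for example with classifiers).

```python
import math
import random


def toy_generator(rng):
    """Hypothetical stand-in for a pretrained GAN: one draw from N(0, 1)."""
    return rng.gauss(0.0, 1.0)


def toy_weight(x):
    """Hypothetical unnormalized ratio target(x) / generator(x).

    A real system would estimate this quantity; here it is a fixed
    sigmoid that favors positive x, purely for illustration.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * x))


def mh_independence_sampler(generator, weight, n_steps, rng):
    """Metropolis-Hastings using the generator as an independence proposal."""
    current = generator(rng)
    chain = []
    for _ in range(n_steps):
        proposal = generator(rng)
        # With proposal density q and target proportional to q * w, the
        # acceptance probability reduces to min(1, w(proposal) / w(current)).
        accept_prob = min(1.0, weight(proposal) / weight(current))
        if rng.random() < accept_prob:
            current = proposal
        chain.append(current)
    return chain


rng = random.Random(0)
chain = mh_independence_sampler(toy_generator, toy_weight, 5000, rng)
chain_mean = sum(chain) / len(chain)
proposal_mean = sum(toy_generator(rng) for _ in range(5000)) / 5000
```

In this toy setup the raw generator draws are centered near zero, while the chain shifts toward the region that the weight favors. The generator is never retrained, which is the sense in which such a sampler can be attached to an existing model.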