CV | Dec 16, 2023

Fusing Conditional Submodular GAN and Programmatic Weak Supervision

Kumar Shubham, Pranav Sastry, Prathosh AP

arXiv:2312.10366v1 | 2.83 citations | h-index: 3 | Has Code | AAAI

Originality: Incremental advance

AI Analysis

This work addresses the problem of making efficient use of unlabeled data in machine learning, offering an incremental improvement over existing methods that fuse weak supervision with generative models.

The paper tackles the challenge of integrating Programmatic Weak Supervision (PWS) and generative models by addressing issues in WSGAN, such as non-class-specific latent factors and noisy label predictions, through a noise-aware classifier and submodular subset selection. It demonstrates improved performance on multiple datasets compared to state-of-the-art methods, though specific numerical gains are not detailed.

Programmatic Weak Supervision (PWS) and generative models serve as crucial tools that enable researchers to maximize the utility of existing datasets without resorting to laborious data gathering and manual annotation. PWS uses various weak supervision techniques to estimate the underlying class labels of data, while generative models primarily concentrate on sampling from the underlying distribution of the given dataset. Although these methods have the potential to complement each other, they have mostly been studied independently. Recently, WSGAN proposed a mechanism to fuse these two models. Its approach uses the discrete latent factors of InfoGAN to train the label model and leverages the class-dependent information of the label model to generate images of specific classes. However, the disentangled latent factors learned by InfoGAN are not necessarily class-specific and could reduce the label model's accuracy. Moreover, predictions made by the label model are often noisy and can degrade the quality of the images generated by the GAN. In our work, we address these challenges by (i) implementing a noise-aware classifier using the pseudo labels generated by the label model and (ii) using the noise-aware classifier's predictions to train the label model and generate class-conditional images. We also investigate the effect of training the classifier on a subset of the dataset chosen within a defined uncertainty budget on the pseudo labels. We accomplish this by formalizing the subset selection problem as a submodular maximization objective with a knapsack constraint on the entropy of the pseudo labels. We conduct experiments on multiple datasets and demonstrate the efficacy of our methods on several tasks vis-a-vis the current state-of-the-art methods.
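The subset selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which submodular function or solver the authors use, so everything below is an assumption made for illustration: the objective is a facility-location function over a precomputed similarity matrix, the per-sample cost is the Shannon entropy of its pseudo-label distribution, and the solver is a plain cost-scaled greedy heuristic. The function names `entropy` and `greedy_knapsack_subset` are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of a pseudo-label probability matrix."""
    p = np.clip(probs, eps, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def greedy_knapsack_subset(similarity, costs, budget, min_cost=1e-6):
    """Cost-scaled greedy selection under a knapsack (budget) constraint.

    Assumed objective (not confirmed by the source): facility location,
        f(S) = sum_i max_{j in S} similarity[i, j].
    similarity: (n, n) non-negative similarity matrix.
    costs:      (n,) per-sample cost, here the pseudo-label entropy.
    budget:     total cost allowed for the selected subset.
    """
    n = similarity.shape[0]
    costs = np.maximum(np.asarray(costs, dtype=float), min_cost)  # avoid divide-by-zero
    best_cover = np.zeros(n)   # max similarity of each point to the current subset
    remaining = float(budget)
    candidates = set(range(n))
    selected = []

    while candidates:
        # Only items that still fit in the remaining budget are eligible.
        feasible = [j for j in candidates if costs[j] <= remaining]
        if not feasible:
            break
        # Marginal gain of adding j to the current subset.
        gains = np.array([
            np.maximum(similarity[:, j] - best_cover, 0.0).sum() for j in feasible
        ])
        k = int(np.argmax(gains / costs[feasible]))  # best gain per unit cost
        if gains[k] <= 0.0:
            break  # no remaining item improves the objective
        j = feasible[k]
        selected.append(j)
        best_cover = np.maximum(best_cover, similarity[:, j])
        remaining -= costs[j]
        candidates.remove(j)

    return selected
```

Given pseudo-label probabilities `probs` and a similarity matrix `sim`, a call such as `greedy_knapsack_subset(sim, entropy(probs), budget=3.0)` returns indices whose total entropy stays within the budget. Cost-scaled greedy on its own carries no constant-factor approximation guarantee; a common remedy is to compare its result against the best single feasible item and keep the better of the two.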

