CV · Mar 14, 2018

Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data

arXiv:1803.05137v1 · 15 citations
Originality: Incremental advance
AI Analysis

The paper addresses data scarcity for machine learning practitioners in fields such as computer vision, though it appears incremental because it extends the existing Data Programming approach with adversarial techniques.

The paper tackles the bottleneck of limited curated labeled data by introducing Adversarial Data Programming (ADP), a method that uses GANs to generate data together with aggregated labels derived from weak labeling functions. The authors report that it outperformed many state-of-the-art models on MNIST, Fashion MNIST, CIFAR 10, and SVHN.

The paucity of large, curated, hand-labeled training data for every domain of interest forms a major bottleneck in the deployment of machine learning models in computer vision and other fields. Recent work (Data Programming) has shown how distant supervision signals in the form of labeling functions can be used to obtain labels for given data in near-constant time. In this work, we present Adversarial Data Programming (ADP), an adversarial methodology that generates data as well as a curated aggregated label, given a set of weak labeling functions. We validated our method on the MNIST, Fashion MNIST, CIFAR 10 and SVHN datasets, and it outperformed many state-of-the-art models. We conducted extensive experiments to study its usefulness, and showed how the proposed ADP framework can be used for transfer learning as well as multi-task learning, where data from two domains are generated simultaneously using the framework along with the label information. Our future work will involve understanding the theoretical implications of this new framework from a game-theoretic perspective, as well as exploring the performance of the method on more complex datasets.
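To make the notion of weak labeling functions concrete, here is a minimal Python sketch. It is not the ADP method: the abstract describes an adversarially learned aggregation, whereas this sketch swaps in a plain majority vote purely for illustration. All names (`lf_bright`, `lf_dark`, `lf_first_pixel`, `aggregate`) and the thresholds are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# A labeling function maps an input to a class label, or abstains (None).
ABSTAIN = None
LabelingFunction = Callable[[Sequence[float]], Optional[int]]


def lf_bright(x: Sequence[float]) -> Optional[int]:
    """Hypothetical heuristic: a high mean intensity suggests class 1."""
    return 1 if sum(x) / len(x) > 0.5 else ABSTAIN


def lf_dark(x: Sequence[float]) -> Optional[int]:
    """Hypothetical heuristic: a low mean intensity suggests class 0."""
    return 0 if sum(x) / len(x) < 0.3 else ABSTAIN


def lf_first_pixel(x: Sequence[float]) -> Optional[int]:
    """Hypothetical heuristic: always votes, based on the first value only."""
    return 1 if x[0] > 0.5 else 0


def aggregate(x: Sequence[float],
              lfs: Sequence[LabelingFunction]) -> Optional[int]:
    """Combine weak votes by simple majority (an illustrative stand-in,
    not the learned aggregation described in the abstract)."""
    votes = [v for v in (lf(x) for lf in lfs) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN  # every labeling function abstained
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    lfs = [lf_bright, lf_dark, lf_first_pixel]
    print(aggregate([0.9, 0.8, 0.7], lfs))
    print(aggregate([0.1, 0.2, 0.1], lfs))
```

Majority voting treats every labeling function as equally reliable. Data Programming-style approaches instead estimate how much to trust each function, which is the part ADP addresses adversarially according to the abstract.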

Code Implementations: 1 repo
