LG ML May 29, 2019

G2R Bound: A Generalization Bound for Supervised Learning from GAN-Synthetic Data

Fu-Chieh Chang, Hao-Jen Wang, Chun-Nan Chou, Edward Y. Chang

arXiv:1905.12313

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for reliable supervised learning in privacy-sensitive or data-scarce scenarios, though its contribution is incremental because it builds on existing generalization theory.

The paper tackles the problem of ensuring generalization when classifiers are trained on GAN-synthetic data, proposing a bound that measures the gap between training on synthetic data and testing on real data.

Supervised learning from data synthesized by Generative Adversarial Networks (GANs), dubbed GAN-synthetic data, has two important applications. First, GANs can generate additional labeled training data, which may help improve classification accuracy. Second, in scenarios where real data cannot be released outside certain premises for privacy and/or security reasons, training on GAN-synthetic data is a plausible alternative. This paper proposes a generalization bound to guarantee the generalization capability of a classifier that learns from GAN-synthetic data. The bound helps developers gauge the generalization gap between learning from synthetic data and testing on real data, and can therefore provide clues for improving the generalization capability.
