NE, LG | Nov 30, 2018

Lipizzaner: A System That Scales Robust Generative Adversarial Network Training

Tom Schmiedlechner, Ignavier Ng Zhi Yong, Abdullah Al-Dujaili, Erik Hemberg, Una-May O'Reilly

arXiv:1811.12843v1 | 14.021 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of robust GAN training for machine learning engineers. It is an incremental advance: it builds on existing coevolutionary methods and focuses on distributing them.

The paper tackles the problem of training Generative Adversarial Networks (GANs) robustly by introducing Lipizzaner, a distributed system that overcomes convergence pathologies like mode and discriminator collapse, leading to improved model performance with scalable communication overhead.

GANs are difficult to train due to convergence pathologies such as mode and discriminator collapse. We introduce Lipizzaner, an open-source software system that allows machine learning engineers to train GANs in a distributed and robust way. Lipizzaner distributes a competitive coevolutionary algorithm which, by virtue of dual, adapting generator and discriminator populations, is robust to collapses. The algorithm is well suited to efficient distribution because it uses a spatial grid abstraction. Training is local to each cell, and strong intermediate training results are exchanged among overlapping neighborhoods, allowing high-performing solutions to propagate and improve with more rounds of training. In experiments on common image datasets, the system overcomes critical collapses. Communication overhead scales linearly when increasing the number of compute instances, and we observe that increasing scale leads to improved model performance.
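
The spatial grid abstraction described in the abstract can be illustrated with a small sketch. The abstract does not specify the grid topology or neighborhood shape, so the wrap-around (toroidal) grid and the five-cell neighborhood (a cell plus its north, south, west, and east neighbors) used here are assumptions, and the function names `neighborhood` and `messages_per_round` are hypothetical, not taken from the Lipizzaner code base.

```python
from typing import List, Tuple

Cell = Tuple[int, int]


def neighborhood(row: int, col: int, rows: int, cols: int) -> List[Cell]:
    """Return the center cell plus its N, S, W, E neighbors.

    Assumption: the grid wraps around at the edges (toroidal), so every
    cell has the same number of neighbors.
    """
    offsets = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    return [((row + dr) % rows, (col + dc) % cols) for dr, dc in offsets]


def messages_per_round(rows: int, cols: int) -> int:
    """Count neighbor-to-center transfers in one exchange round.

    Assumption: each cell pulls one result from every neighbor in its
    neighborhood (excluding itself) once per round.
    """
    total = 0
    for r in range(rows):
        for c in range(cols):
            total += len(neighborhood(r, c, rows, cols)) - 1
    return total


if __name__ == "__main__":
    print(neighborhood(0, 0, 3, 3))
    for side in (2, 3, 4, 5):
        print(side * side, "cells:", messages_per_round(side, side), "transfers")
```

Under these assumptions each cell exchanges results with a fixed number of neighbors, so the transfer count per round grows in proportion to the number of cells. That is one way to read the abstract's statement that communication overhead scales linearly with the number of compute instances. Because neighborhoods overlap, a strong solution in one cell can reach distant cells over several rounds without any global broadcast.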
