Multimodal Controller for Generative Models
The multimodal controller provides a more efficient and flexible solution for class-conditional data generation, benefiting researchers and practitioners in generative modeling.
The paper tackles the problem of generating multimodal data from class labels without modifying backbone generative architectures. It introduces a plug-and-play multimodal controller that improves image quality on the CIFAR10, COIL100, and Omniglot benchmark datasets.
Class-conditional generative models are crucial tools for data generation from user-specified class labels. Existing approaches for class-conditional generative models require nontrivial modifications of backbone generative architectures to model conditional information fed into the model. This paper introduces a plug-and-play module named "multimodal controller" to generate multimodal data without introducing additional learning parameters. In the absence of the controllers, our model reduces to non-conditional generative models. We test the efficacy of multimodal controllers on CIFAR10, COIL100, and Omniglot benchmark datasets. We demonstrate that multimodal controlled generative models (including VAE, PixelCNN, Glow, and GAN) can generate class-conditional images of significantly better quality when compared with conditional generative models. Moreover, we show that multimodal controlled models can also create novel modalities of images.
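The abstract does not spell out how the controller operates. As a rough illustration only, one way a parameter-free, plug-and-play controller could be built is to mask the channels of a hidden feature map with a fixed binary codeword per class. The masking mechanism, the random codebook, and the names `make_codebook` and `multimodal_controller` are all assumptions made for this sketch, not details stated above.

```python
import numpy as np


def make_codebook(num_classes, num_channels, seed=0):
    """Hypothetical helper: draw one fixed binary codeword per class.

    The codebook is sampled once and never trained, so it adds no
    learnable parameters to the backbone.
    """
    rng = np.random.default_rng(seed)
    codes = rng.integers(0, 2, size=(num_classes, num_channels))
    return codes.astype(np.float32)


def multimodal_controller(features, labels=None, codebook=None):
    """Assumed controller: keep only the channels selected by each label.

    features: array of shape (batch, channels, height, width)
    labels:   integer array of shape (batch,), or None
    codebook: array of shape (num_classes, channels), or None

    With no labels or no codebook the input is returned unchanged,
    which corresponds to the unconditional backbone.
    """
    if labels is None or codebook is None:
        return features
    # Look up one codeword per sample and broadcast it over height and width.
    mask = codebook[labels][:, :, None, None]
    return features * mask


if __name__ == "__main__":
    codebook = make_codebook(num_classes=10, num_channels=8)
    hidden = np.ones((4, 8, 2, 2), dtype=np.float32)
    labels = np.array([0, 3, 3, 9])
    controlled = multimodal_controller(hidden, labels, codebook)
    print(controlled.shape)
```

In this sketch the controller sits between existing layers and leaves their weights and shapes untouched, which is what makes it plug-and-play. Samples with the same label share the same active channels, and dropping the controller recovers the original backbone.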