From A to Z: Supervised Transfer of Style and Content Using Deep Neural Network Generators
This paper addresses the challenge of single-image analogies for tasks such as font generation, though it appears incremental because it builds on existing VAE methods.
The paper tackles the problem of generating a full set of stylistically similar images from a single input image by separating style from content, achieving 22.4% lower dissimilarity to the ground truth than the state of the art on a font generation task.
We propose a new neural network architecture for solving single-image analogies: the generation of an entire set of stylistically similar images from just a single input image. Solving this problem requires separating image style from content. Our network is a modified variational autoencoder (VAE) that supports supervised training of single-image analogies and in-network evaluation of outputs with a structured similarity objective that captures pixel covariances. On the challenging task of generating a 62-letter font from a single example letter, we produce images with 22.4% lower dissimilarity to the ground truth than the state of the art.
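
The abstract does not spell out the structured similarity objective, so the following is only a minimal sketch of what a covariance-aware image comparison might look like. It assumes the objective resembles the standard SSIM index evaluated over a single global window, and that dissimilarity is defined as (1 - SSIM) / 2. The function names, the constants `k1` and `k2`, and the flat-list image representation are illustrative assumptions, not the paper's implementation.

```python
def _mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def global_ssim(x, y, dynamic_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized flat pixel lists.

    Assumption: one global window is used instead of local sliding
    windows, and k1/k2 are the conventional SSIM stabilizers.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")

    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2

    mu_x, mu_y = _mean(x), _mean(y)
    var_x = _mean([(a - mu_x) ** 2 for a in x])
    var_y = _mean([(b - mu_y) ** 2 for b in y])
    # The covariance term is what lets the score react to pixel structure,
    # not just to per-pixel intensity differences.
    cov_xy = _mean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def dissimilarity(x, y):
    """Assumed dissimilarity: 0 for identical images, near 1 for inverted ones."""
    return (1.0 - global_ssim(x, y)) / 2.0


if __name__ == "__main__":
    glyph = [0.0, 0.0, 1.0, 1.0]      # hypothetical 2x2 image, flattened
    inverted = [1.0, 1.0, 0.0, 0.0]   # same glyph with pixels inverted
    print(dissimilarity(glyph, glyph))
    print(dissimilarity(glyph, inverted))
```

Under these assumptions, an image compared with itself scores essentially zero, while an inverted copy scores close to one because its covariance with the original is negative. A per-pixel squared error would also penalize the inverted copy, but it would not distinguish structured errors from unstructured ones in the way the covariance term does.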