Character Generation through Self-Supervised Vectorization
This work addresses image generation for domains such as character recognition by introducing a vector-based approach that builds images incrementally, stroke by stroke, and combines reinforcement learning with self-supervised training.
The paper tackles image generation with a self-supervised drawing agent that operates on stroke-level representations. The agent produces raster images using a minimal number of strokes and decides dynamically when to stop. It achieves successful results on generation and parsing tasks on the MNIST and Omniglot datasets without stroke-level supervision.
The prevalent approach in self-supervised image generation is to operate on pixel-level representations. While this approach can produce high-quality images, it cannot benefit from the simplicity and innate quality of vectorization. Here we present a drawing agent that operates on a stroke-level representation of images. At each time step, the agent first assesses the current canvas and decides whether to stop or keep drawing. When a 'draw' decision is made, the agent outputs a program indicating the stroke to be drawn. As a result, it produces a final raster image by drawing the strokes on a canvas, using a minimal number of strokes and dynamically deciding when to stop. We train our agent through reinforcement learning on the MNIST and Omniglot datasets for unconditional generation and parsing (reconstruction) tasks. We utilize our parsing agent for exemplar generation and type-conditioned concept generation in the Omniglot challenge without any further training. We present successful results on all three generation tasks and the parsing task. Crucially, we do not need any stroke-level or vector supervision; we only use raster images for training.
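The draw-or-stop loop described above can be illustrated with a minimal sketch for the parsing setting. This is not the paper's implementation: the names `render_stroke`, `run_agent`, and `greedy_toy_policy` are hypothetical, strokes are assumed to be straight line segments, and a hand-written greedy rule stands in for the policy that the paper learns through reinforcement learning.

```python
import numpy as np

def render_stroke(canvas, stroke, samples=64):
    """Rasterize a straight stroke (x0, y0, x1, y1) onto a copy of the canvas.
    Assumption: strokes are line segments; the source only says 'a program'."""
    x0, y0, x1, y1 = stroke
    out = canvas.copy()
    t = np.linspace(0.0, 1.0, samples)
    xs = np.rint(x0 + t * (x1 - x0)).astype(int)
    ys = np.rint(y0 + t * (y1 - y0)).astype(int)
    h, w = out.shape
    keep = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
    out[ys[keep], xs[keep]] = 1.0
    return out

def run_agent(policy, target, max_steps=10):
    """At each step the policy inspects the canvas and either stops or draws."""
    canvas = np.zeros_like(target)
    strokes = []
    for _ in range(max_steps):
        decision, stroke = policy(canvas, target)
        if decision == "stop":
            break
        canvas = render_stroke(canvas, stroke)
        strokes.append(stroke)
    return canvas, strokes

def greedy_toy_policy(candidates):
    """Hypothetical stand-in for the learned policy: draw the candidate stroke
    that most reduces the pixel error, and stop when no candidate helps."""
    def policy(canvas, target):
        best, best_err = None, np.abs(canvas - target).sum()
        for s in candidates:
            err = np.abs(render_stroke(canvas, s) - target).sum()
            if err < best_err:
                best, best_err = s, err
        return ("stop", None) if best is None else ("draw", best)
    return policy

# Toy target: a plus sign built from two strokes on a 9x9 raster.
horizontal, vertical, diagonal = (1, 4, 7, 4), (4, 1, 4, 7), (0, 0, 8, 8)
target = render_stroke(render_stroke(np.zeros((9, 9)), horizontal), vertical)

policy = greedy_toy_policy([diagonal, horizontal, vertical])
canvas, strokes = run_agent(policy, target)
print(len(strokes), np.array_equal(canvas, target))
```

In this toy case the loop draws the two useful strokes, skips the diagonal because it would increase the error, and then stops on its own. Only the raster target is consulted, which mirrors the abstract's point that no stroke-level supervision is needed; the learned agent replaces the greedy rule with a trained stop/draw decision and stroke output.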