CPT: Controllable and Editable Design Variations with Language Models
The work addresses scalability and personalization challenges in creative workflows for designers, though it appears incremental, since it applies existing language model techniques to a specific domain.
The paper tackles the manual, time-consuming process of producing diverse, high-quality designs. It introduces a system built on a language model, the Creative Pre-trained Transformer (CPT), trained on design templates to produce editable design variations. The outputs are semantically structured, stylistically coherent, and internally consistent.
Creating visually diverse, high-quality designs remains a manual, time-consuming process, limiting scalability and personalization in creative workflows. We present a system for generating editable design variations using a decoder-only language model, the Creative Pre-trained Transformer (CPT), trained to predict visual style attributes in design templates. At the core of our approach is a new representation called Creative Markup Language (CML), a compact, machine-learning-friendly format that captures canvas-level structure, page layout, and element-level details (text, images, and vector graphics), including both content and style. We fine-tune CPT on a large corpus of design templates authored by professional designers, enabling it to learn meaningful, context-aware predictions for attributes such as color schemes and font choices. The model produces semantically structured and stylistically coherent outputs, preserving internal consistency across elements. Unlike generative image models, our system yields fully editable design documents rather than pixel-only images, allowing users to iterate and personalize within a design editor. In experiments, our approach generates contextual color and font variations for existing templates and shows promise in adjusting layouts while maintaining design principles.
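To make the idea of a compact markup with predictable style attributes more concrete, the following is a minimal sketch only. The abstract does not specify CML's actual syntax or schema, so every name here (`Canvas`, `Page`, `Element`, `serialize`, `STYLE_KEYS`, the tag layout, and the `?` placeholder) is a hypothetical illustration, not the authors' format. It shows a design serialized once with style values masked, as a possible model prompt, and once in full, as a possible prediction target.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical: the real CML schema is not given in the text.
# These are the style attributes a model might be asked to predict.
STYLE_KEYS = {"color", "font"}


@dataclass
class Element:
    kind: str                      # "text", "image", or "vector"
    content: str                   # text string or asset reference
    style: Dict[str, str] = field(default_factory=dict)


@dataclass
class Page:
    elements: List[Element] = field(default_factory=list)


@dataclass
class Canvas:
    width: int
    height: int
    pages: List[Page] = field(default_factory=list)


def serialize(canvas: Canvas, mask_style: bool = False) -> str:
    """Flatten the design into one compact string.

    With mask_style=True, style values are replaced by '?' so that the
    string can serve as a prompt whose masked slots are to be predicted.
    """
    parts = [f"<canvas w={canvas.width} h={canvas.height}>"]
    for page in canvas.pages:
        parts.append("<page>")
        for el in page.elements:
            attrs = "".join(
                f" {k}={'?' if mask_style and k in STYLE_KEYS else v}"
                for k, v in sorted(el.style.items())
            )
            parts.append(f"<{el.kind}{attrs}>{el.content}</{el.kind}>")
        parts.append("</page>")
    parts.append("</canvas>")
    return "".join(parts)


doc = Canvas(
    width=1080,
    height=1080,
    pages=[Page([Element("text", "Summer Sale",
                         {"color": "#FF5733", "font": "Lato"})])],
)
prompt = serialize(doc, mask_style=True)   # style slots left open
target = serialize(doc)                    # fully specified design
```

Because the output stays a structured document rather than pixels, a predicted string in this style could in principle be parsed back into editable elements, which is the property the abstract contrasts with generative image models.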