VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics
This work addresses the challenge of natural language-driven vector graphic editing, a problem of interest to researchers, but its contribution is incremental: it focuses on dataset creation and benchmarking.
The authors tackle instruction-guided vector image editing by introducing a large-scale dataset of over 270,000 pairs of SVG images, each accompanied by a natural language edit instruction. Initial experiments show that current methods struggle to produce accurate and valid edits.
We introduce a large-scale dataset for instruction-guided vector image editing, consisting of over 270,000 pairs of SVG images, each accompanied by a natural language edit instruction. Our dataset enables training and evaluation of models that modify vector graphics based on textual commands. We describe the data collection process, including image pairing via CLIP similarity and instruction generation with vision-language models. Initial experiments with state-of-the-art large language models reveal that current methods struggle to produce accurate and valid edits, underscoring the difficulty of this task. To foster research in natural language-driven vector graphic generation and editing, we make the resources created in this work publicly available.
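As a rough illustration of what "image pairing via CLIP similarity" could look like, the following minimal sketch pairs images whose embedding vectors fall within a cosine-similarity band. It is not the authors' pipeline: it assumes CLIP embeddings have already been computed elsewhere, and the function names, the toy three-dimensional vectors, and the `min_sim` / `max_sim` thresholds are all hypothetical choices made for the example.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pair_by_similarity(embeddings, min_sim=0.8, max_sim=0.99):
    """Return (id_a, id_b, similarity) for every pair inside the similarity band.

    ASSUMPTION: the band is illustrative. The lower bound keeps pairs that are
    related enough for an edit instruction to make sense; the upper bound drops
    near-duplicates, for which there would be nothing to edit.
    """
    pairs = []
    # Brute-force O(n^2) comparison; fine for a toy example only.
    for (id_a, vec_a), (id_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        sim = cosine_similarity(vec_a, vec_b)
        if min_sim <= sim <= max_sim:
            pairs.append((id_a, id_b, sim))
    # Most similar candidates first.
    pairs.sort(key=lambda p: p[2], reverse=True)
    return pairs


# Hypothetical stand-ins for precomputed CLIP image embeddings.
toy_embeddings = {
    "icon_a.svg": [1.0, 0.0, 0.0],
    "icon_b.svg": [0.9, 0.3, 0.0],
    "icon_c.svg": [0.0, 1.0, 0.0],
    "icon_d.svg": [1.0, 0.0, 0.0],  # exact duplicate of icon_a in embedding space
}

candidate_pairs = pair_by_similarity(toy_embeddings)
for id_a, id_b, sim in candidate_pairs:
    print(f"{id_a} <-> {id_b}: {sim:.3f}")
```

In this toy setting, `icon_a.svg` and `icon_d.svg` are excluded as near-duplicates, while `icon_c.svg` is too dissimilar to pair with anything. At the scale of the dataset described above, an exhaustive pairwise comparison like this would likely be replaced by an approximate nearest-neighbor search.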