CV · Oct 5, 2020

A Benchmark and Baseline for Language-Driven Image Editing

arXiv:2010.02330v1 · 36 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for more general, free-form image editing tools, particularly for photography novices. The advance is incremental, since it builds on existing language-driven editing tasks.

The authors tackled the problem of language-driven image editing by creating a dataset with local and global editing annotations and proposing a baseline method that predicts operation parameters, achieving strong performance on user data.

Language-driven image editing can save much of the labor of image editing and is friendly to photography novices. However, most similar work can only handle a specific image domain or can only perform global retouching. To address this new task, we first present a new language-driven image editing dataset that supports both local and global editing, with editing operation and mask annotations. We also propose a baseline method that fully utilizes these annotations. Our method treats each editing operation as a sub-module and can automatically predict operation parameters. This approach not only performs well on challenging user data but is also highly interpretable. We believe our work, including both the benchmark and the baseline, will advance the image editing area toward a more general and free-form level.
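The idea of treating each editing operation as a sub-module, applied either locally through a mask or globally, can be illustrated with a small sketch. This is not the authors' model: the operation names, the formulas, and the `apply_edit` helper are hypothetical. The parameter that the paper's method predicts automatically is set by hand here.

```python
from typing import Callable, Dict, List

Image = List[List[float]]  # grayscale pixels in [0, 1], row-major
Mask = List[List[float]]   # 1.0 where the edit applies, 0.0 elsewhere


def clamp(value: float) -> float:
    """Keep a pixel value inside the valid [0, 1] range."""
    return max(0.0, min(1.0, value))


def brightness(pixel: float, amount: float) -> float:
    # Hypothetical operation: scale intensity by (1 + amount).
    return clamp(pixel * (1.0 + amount))


def contrast(pixel: float, amount: float) -> float:
    # Hypothetical operation: stretch values around mid-gray.
    return clamp(0.5 + (pixel - 0.5) * (1.0 + amount))


# Each editing operation is an independent sub-module looked up by name.
OPERATIONS: Dict[str, Callable[[float, float], float]] = {
    "brightness": brightness,
    "contrast": contrast,
}


def apply_edit(image: Image, mask: Mask, op_name: str, param: float) -> Image:
    """Apply one operation, blending edited and original pixels by the mask."""
    op = OPERATIONS[op_name]
    return [
        [m * op(p, param) + (1.0 - m) * p for p, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


image = [[0.2, 0.4], [0.6, 0.8]]
local_mask = [[1.0, 0.0], [0.0, 0.0]]   # edit only the top-left pixel
global_mask = [[1.0, 1.0], [1.0, 1.0]]  # edit the whole image

# The parameter 0.5 is hand-picked; a learned model would predict it.
local_result = apply_edit(image, local_mask, "brightness", 0.5)
global_result = apply_edit(image, global_mask, "brightness", 0.5)
```

Because every edit reduces to an operation name, a parameter, and a mask, the chosen edit can be read off directly, which is one way to understand the interpretability claim.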

