CL, Dec 3, 2018

A System for Automated Image Editing from Natural Language Commands

arXiv:1812.01083v1, 5 citations
Originality: Incremental advance
AI Analysis

This work aims to make image editing more accessible through a natural language interface. The advance is incremental because it builds on existing methods for mapping text to actions.

The paper tackles automatic image editing from natural language commands. It develops a framework that maps user requests to executable editing actions, using a corpus of over 6000 crowdsourced image edit requests. It finds that LSTM, SVM, and bidirectional LSTM-CRF models perform best at detecting actions and their associated entities.

This work presents the task of modifying images in an image editing program using natural language written commands. We utilize a corpus of over 6000 image edit text requests to alter real world images collected via crowdsourcing. A novel framework composed of actions and entities to map a user's natural language request to executable commands in an image editing program is described. We resolve previously labeled annotator disagreement through a voting process and complete annotation of the corpus. We experimented with different machine learning models and found that the LSTM, the SVM, and the bidirectional LSTM-CRF joint models are the best to detect image editing actions and associated entities in a given utterance.
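To make the action-and-entity framing in the abstract more concrete, here is a minimal sketch of how a tagged utterance might be turned into a structured command. It assumes an utterance-level action label and per-token BIO tags are already available from some upstream classifier and tagger. The label names (`adjust`, `attribute`, `object`, `value`), the `EditCommand` class, and both helper functions are hypothetical and chosen only for illustration; they are not the paper's annotation scheme or code.

```python
from dataclasses import dataclass, field


@dataclass
class EditCommand:
    """Hypothetical structured command: one action plus its entities."""
    action: str
    entities: dict = field(default_factory=dict)


def collect_entities(tokens, tags):
    """Group BIO-tagged tokens into {entity_type: [phrase, ...]}."""
    entities = {}
    span_type, span_words = None, []
    # A trailing "O" sentinel closes any span still open at the end.
    for token, tag in zip(list(tokens) + [""], list(tags) + ["O"]):
        if tag.startswith("I-") and tag[2:] == span_type:
            # Continuation of the currently open span.
            span_words.append(token)
            continue
        if span_type is not None:
            # Any other tag ends the open span, so store it.
            entities.setdefault(span_type, []).append(" ".join(span_words))
        if tag.startswith("B-"):
            span_type, span_words = tag[2:], [token]
        else:
            # "O" tags (and stray "I-" tags) are not part of any span.
            span_type, span_words = None, []
    return entities


def build_command(action, tokens, tags):
    """Combine the predicted action with the extracted entities."""
    return EditCommand(action=action, entities=collect_entities(tokens, tags))


# Illustrative input; the tags are written by hand, not model output.
tokens = ["increase", "the", "brightness", "of", "the", "sky", "a", "little"]
tags = ["O", "O", "B-attribute", "O", "O", "B-object", "B-value", "I-value"]
command = build_command("adjust", tokens, tags)
print(command)
```

A structure like this keeps language understanding (predicting the action and tags) separate from execution, so a later step could translate `command` into calls to whichever image editing program is in use.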

