Adjusting Image Attributes of Localized Regions with Low-level Dialogue
This work addresses the difficulty novices face in image editing by introducing a low-level language interface, though its application of dialogue systems to this domain is incremental.
The paper tackles the problem of ambiguous natural language instructions in image editing by developing a dialogue system that uses low-level commands. In a user evaluation, 25% of users found it easy to use, and object segmentation was identified as the key factor in user satisfaction.
Natural Language Image Editing (NLIE) aims to edit images using natural language instructions. Because novices are inexperienced with image editing techniques, their instructions are often ambiguous and contain high-level abstractions that correspond to complex sequences of editing steps. Motivated by this inexperience, we aim to smooth the learning curve by teaching novices to edit images using low-level command terminology. To this end, we develop a task-oriented dialogue system to investigate low-level instructions for NLIE. Our system grounds language at the level of edit operations and suggests options for the user to choose from. Although users are compelled to express themselves in low-level terms, a user evaluation shows that 25% of users found our system easy to use, which resonates with our motivation. An analysis shows that users generally adapt to the proposed low-level language interface. In this study, we identify object segmentation as the key factor in user satisfaction. Our work demonstrates the advantages of a low-level, direct language-action mapping approach that can be applied to problem domains beyond image editing, such as audio editing or industrial design.
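To make the idea of grounding language at the level of edit operations concrete, the following is a minimal sketch, not the system described above. It assumes a tiny command grammar of our own invention; the attribute list, the `EditOp` type, and the functions `parse_command` and `suggest_options` are all hypothetical names introduced for illustration only. It shows how one low-level instruction can map directly onto one edit operation over an attribute and an optional region, and how a fixed list of options could be offered when an instruction cannot be grounded.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical vocabulary: the source does not list the supported attributes.
ATTRIBUTES = ("brightness", "contrast", "saturation")
DIRECTIONS = {"increase": 1, "decrease": -1}

# Assumed toy grammar: "<increase|decrease> [the] <attribute> [of [the] <region>] by <amount>"
PATTERN = re.compile(
    r"(?P<dir>increase|decrease)\s+(?:the\s+)?(?P<attr>\w+)"
    r"(?:\s+of\s+(?:the\s+)?(?P<region>[\w ]+?))?"
    r"\s+by\s+(?P<amount>\d+)"
)


@dataclass
class EditOp:
    attribute: str         # image attribute to adjust
    region: Optional[str]  # named region (e.g. a segmented object); None means unspecified
    delta: int             # signed adjustment amount


def parse_command(text: str) -> Optional[EditOp]:
    """Map one low-level instruction to an edit operation, or None if it does not fit."""
    m = PATTERN.fullmatch(text.strip().lower())
    if m is None or m.group("attr") not in ATTRIBUTES:
        return None
    delta = DIRECTIONS[m.group("dir")] * int(m.group("amount"))
    return EditOp(m.group("attr"), m.group("region"), delta)


def suggest_options(step: int = 10) -> List[str]:
    """Concrete low-level commands to offer when an instruction cannot be grounded."""
    return [f"{d} {a} by {step}" for d in DIRECTIONS for a in ATTRIBUTES]


if __name__ == "__main__":
    for utterance in ("Increase brightness of the sky by 20", "make it pop"):
        op = parse_command(utterance)
        print(utterance, "->", op if op is not None else suggest_options())
```

In this sketch a high-level request such as "make it pop" fails to parse and falls back to the suggested options, which mirrors the abstract's point that ambiguous instructions are replaced by explicit, low-level choices. Resolving a region name such as "sky" to actual pixels would require object segmentation, which is outside the scope of this example.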