Grounded Complex Task Segmentation for Conversational Assistants
This work addresses the challenge of making complex instructions more manageable for users of conversational assistants, though its contribution is incremental because it covers only the recipes domain.
The paper converts recipe instructions written for reading into steps suited to conversation. A token-based Transformer approach gave the best results among the architectures tested, and in a user study the proposed method improved 86% of the evaluated tasks for conversational suitability.
Following complex instructions delivered by a conversational assistant can be daunting, because attention and memory spans are shorter in conversation than when reading the same instructions. Hence, when conversational assistants walk users through the steps of complex tasks, there is a need to structure the task into manageable pieces of information of the right length and complexity. In this paper, we tackle the recipes domain and convert instructions structured for reading into instructions structured for conversation. We annotated the structure of instructions according to a conversational scenario, which provided insights into what is expected in this setting. To computationally model the characteristics of a conversational step, we tested various Transformer-based architectures, showing that a token-based approach delivers the best results. A further user study showed that users tend to favor steps of manageable complexity and length, and that the proposed methodology can improve the original web-based instructional text. Specifically, 86% of the evaluated tasks were improved from a conversational suitability point of view.
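
As a rough illustration of what a token-based formulation could look like, the sketch below groups tokens into conversational steps from per-token boundary labels. The abstract does not describe the labeling scheme, so the binary "end of step" labels, the function name `segment_by_boundaries`, and the example recipe are all assumptions made for illustration. In practice the labels would come from a trained token classifier; here they are written by hand.

```python
from typing import List, Sequence


def segment_by_boundaries(tokens: Sequence[str], is_step_end: Sequence[int]) -> List[str]:
    """Group tokens into steps, closing a step after every token labeled 1.

    Hypothetical helper: the 0/1 labeling scheme is an assumption,
    not the scheme used in the paper.
    """
    if len(tokens) != len(is_step_end):
        raise ValueError("tokens and labels must have the same length")
    steps: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, is_step_end):
        current.append(token)
        if label == 1:
            steps.append(" ".join(current))
            current = []
    if current:  # trailing tokens without a closing label form a final step
        steps.append(" ".join(current))
    return steps


# Hand-written example input; a model would predict these labels.
tokens = "Chop the onions . Fry them until golden . Add the rice .".split()
labels = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]

for step in segment_by_boundaries(tokens, labels):
    print(step)
```

Predicting boundaries per token, rather than per sentence, would in principle let a model split a long sentence into two spoken steps, which is one plausible reason such a formulation could suit this task.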