HC · CL · Jun 16, 2023

ReactGenie: A Development Framework for Complex Multimodal Interactions Using Large Language Models

arXiv:2306.09649v3 · 11 citations · h-index: 6
Originality: Incremental advance
AI Analysis

The paper addresses the inefficiency of multimodal interface development for app developers. It offers a new framework that simplifies the creation of rich interactions, though its improvement over existing multimodal approaches is incremental.

The paper tackles the laborious development of complex multimodal interfaces by introducing ReactGenie, a framework that uses large language models to translate multimodal commands into a programming language. In the reported evaluation, developers built applications in under 2.5 hours on average, and end-users completed tasks faster and with less task load than with a traditional GUI.

By combining voice and touch interactions, multimodal interfaces can surpass the efficiency of either modality alone. Traditional multimodal frameworks require laborious developer work to support rich multimodal commands where the user's multimodal command involves possibly exponential combinations of actions/function invocations. This paper presents ReactGenie, a programming framework that better separates multimodal input from the computational model to enable developers to create efficient and capable multimodal interfaces with ease. ReactGenie translates multimodal user commands into NLPL (Natural Language Programming Language), a programming language we created, using a neural semantic parser based on large-language models. The ReactGenie runtime interprets the parsed NLPL and composes primitives in the computational model to implement complex user commands. As a result, ReactGenie allows easy implementation and unprecedented richness in commands for end-users of multimodal apps. Our evaluation showed that 12 developers can learn and build a nontrivial ReactGenie application in under 2.5 hours on average. In addition, compared with a traditional GUI, end-users can complete tasks faster and with less task load using ReactGenie apps.
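The idea of a runtime that interprets a parsed command and composes primitives from the computational model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Menu` and `Order` classes, the `Ref`, `Touched`, and `Call` node types, and the `evaluate` function are hypothetical and are neither ReactGenie's API nor actual NLPL syntax. The expression tree simply stands in for whatever the semantic parser would produce from a voice-plus-touch command.

```python
from dataclasses import dataclass, field
from typing import Any


# Hypothetical computational model: plain classes exposing small primitives.
@dataclass
class Order:
    items: list[str] = field(default_factory=list)

    def add(self, item: str) -> "Order":
        self.items.append(item)
        return self  # returning self lets calls be chained


@dataclass
class Menu:
    dishes: dict[str, float]

    def cheapest(self) -> str:
        return min(self.dishes, key=self.dishes.get)


# Hypothetical stand-in for a parsed command: a small expression tree.
@dataclass
class Ref:
    name: str  # a named object in the app state, e.g. "menu"


@dataclass
class Touched:
    pass  # placeholder for whatever the user is pointing at on screen


@dataclass
class Call:
    target: Any
    method: str
    args: tuple = ()


def evaluate(node: Any, env: dict[str, Any], touched: Any) -> Any:
    """Recursively evaluate a command tree against the app state."""
    if isinstance(node, Ref):
        return env[node.name]
    if isinstance(node, Touched):
        return touched
    if isinstance(node, Call):
        target = evaluate(node.target, env, touched)
        args = [evaluate(arg, env, touched) for arg in node.args]
        return getattr(target, node.method)(*args)
    return node  # anything else is treated as a literal value


env = {"menu": Menu({"soup": 4.0, "pasta": 9.5}), "order": Order()}

# Assumed parse of: "Add the cheapest dish and this one to my order",
# spoken while the user touches "pasta" on screen.
command = Call(
    Call(Ref("order"), "add", (Call(Ref("menu"), "cheapest"),)),
    "add",
    (Touched(),),
)

result = evaluate(command, env, touched="pasta")
print(result.items)
```

In this sketch the developer writes only the two primitives, `add` and `cheapest`; the compound command is handled by composing them at evaluation time instead of through a handler written for that specific combination.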

