Chien-Ming Lin

46.0 · HC · May 6

IntenBot: Flexible and Imprecise Multimodal Input for LLMs to Understand User Intentions for Casual and Human-Like HRI

Yen-Ting Liu, Chiu-Hsuan Wang, TzuLing Chen et al.

In natural human-to-human communication, multimodal user input is typically used to supplement explicit and complement implicit voice commands, with casualness allowing for flexible input modality combinations and tolerance for imprecise input data. For example, saying "I want that." with a casual glance at a bottle of water is clear enough in human-to-human communication as an implicit voice command accompanied by gaze and/or gestures, rather than an explicit one. To enable such a human-like interaction in human-robot interaction (HRI), we propose a system, IntenBot, to understand user intentions from flexible and imprecise multimodal input, including voice, gaze, and finger-pointing, in XR. The disambiguation capability of large language models (LLMs) is used to filter out irrelevant input modalities and imprecise input data, generating potential instructions for user confirmation. The flexible and imprecise multimodal input enables casual, human-like interaction with robots, reducing time, effort, and attention, and could also be used as non-voice input. We conducted an informative user behavior study in a simulated environment to understand users' natural behavior in flexibly interacting with a robot using multimodal input and to obtain appropriate angle range parameters for gaze and finger-pointing. An XR study was then performed to evaluate the performance of IntenBot, compared with other methods. We also deployed IntenBot on a physical robot to showcase its real-world applications.

LG · Dec 31, 2020

Maximum-Likelihood Quantum State Tomography by Soft-Bayes

Chien-Ming Lin, Yu-Ming Hsu, Yen-Huan Li

Quantum state tomography (QST), the task of estimating an unknown quantum state given measurement outcomes, is essential to building reliable quantum computing devices. Whereas computing the maximum-likelihood (ML) estimate corresponds to solving a finite-sum convex optimization problem, the objective function is neither smooth nor Lipschitz, so most existing convex optimization methods lack sample complexity guarantees; moreover, both the sample size and dimension grow exponentially with the number of qubits in a QST experiment, so a desired algorithm should be highly scalable with respect to the dimension and sample size, just like stochastic gradient descent. In this paper, we propose a stochastic first-order algorithm that computes an $\varepsilon$-approximate ML estimate in $O((D \log D) / \varepsilon^2)$ iterations with $O(D^3)$ per-iteration time complexity, where $D$ denotes the dimension of the unknown quantum state and $\varepsilon$ denotes the optimization error. Our algorithm is an extension of Soft-Bayes to the quantum setup.
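To make the "finite-sum convex optimization problem" concrete, the following is a minimal sketch of the objective only, not of the paper's algorithm. It assumes the standard ML formulation for QST, which the abstract does not spell out: minimize $f(\rho) = -\frac{1}{n} \sum_{i=1}^{n} \log \operatorname{tr}(A_i \rho)$ over density matrices $\rho$, where $A_i$ is the measurement operator of the $i$-th observed outcome. The function names and the single-qubit data are hypothetical and chosen purely for illustration.

```python
import math

def trace_product(a, b):
    """Return tr(a @ b) for two square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

def neg_log_likelihood(rho, outcomes):
    """Average negative log-likelihood: -(1/n) * sum_i log tr(A_i rho).

    `rho` is a candidate density matrix and `outcomes` is a list of the
    measurement operators A_i, one per observed outcome (assumed setup).
    """
    total = 0.0
    for a in outcomes:
        p = trace_product(a, rho).real  # probability of the observed outcome
        total -= math.log(p)            # math.log raises ValueError if p == 0
    return total / len(outcomes)

# Hypothetical single-qubit data: projectors onto |0> and |+>.
P0 = [[1, 0], [0, 0]]
PPLUS = [[0.5, 0.5], [0.5, 0.5]]
MAXIMALLY_MIXED = [[0.5, 0], [0, 0.5]]

value = neg_log_likelihood(MAXIMALLY_MIXED, [P0, PPLUS])
```

Each summand is the negative logarithm of a probability, so the objective grows without bound as any $\operatorname{tr}(A_i \rho)$ approaches zero. This is one way to see why the objective is neither smooth nor Lipschitz on the set of density matrices.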
