Leveraging Large Language Models in Human-Robot Interaction: A Critical Analysis of Potential and Pitfalls
This work offers an incremental analysis of integrating LLMs/VLMs into socially assistive robots, focusing on opportunities and risks in domains such as education and healthcare.
The paper analyzes the potential and challenges of integrating large language models (LLMs) and vision language models (VLMs) into socially assistive robots (SARs) for human-robot interaction, based on a meta-study of more than 250 papers, and outlines a pathway for responsible adoption.
The emergence of large language models (LLMs) and, consequently, vision language models (VLMs) has sparked the imagination of robotics researchers. At this point, the range of applications for LLMs and VLMs in human-robot interaction (HRI), particularly in socially assistive robots (SARs), is uncharted territory. LLMs and VLMs present both unprecedented opportunities and unprecedented challenges for SAR integration, and we aim to illuminate both for roboticists who deploy these models in SARs. First, we conducted a meta-study of more than 250 papers exploring 1) major robots in HRI research and 2) significant applications of SARs, emphasizing education, healthcare, and entertainment, while addressing 3) societal norms and issues such as trust, bias, and ethics that robot developers must address. Then, we identified 4) critical components of a robot that an LLM or VLM can replace, while addressing 5) the benefits of integrating LLMs into robot designs and 6) the risks involved. Finally, we outline a pathway for the responsible and effective adoption of LLMs or VLMs in SARs, and we close our discussion with a note of caution regarding this deployment.